Preface


This book is based on notes compiled during the many years I taught the course
“Applied Functional Analysis” in the first year of the master’s programme at Delft
University of Technology, for students with prior exposure to the basics of Real Analysis
and the theory of Lebesgue integration. Starting with the basic results of the subject 
covered in a typical Functional Analysis course, the text progresses towards a treatment 
of several advanced topics, including Fredholm theory, boundary value problems, form 
methods, semigroup theory, trace formulas, and some mathematical aspects of Quantum 
Mechanics. With a few exceptions in the later chapters, complete and detailed proofs 
are given throughout. This makes the text ideally suited for students wishing to enter 
the field. 

Great care has been taken to present the various topics in a connected and integrated 
way, and to illustrate abstract results with concrete (and sometimes nontrivial) appli- 
cations. For example, after introducing Banach spaces and discussing some of their 
abstract properties, a substantial chapter is devoted to the study of the classical Banach 
spaces $C(K)$, $L^p(\Omega)$, $M(\Omega)$, with some emphasis on compactness, density, and approximation
techniques. The abstract material in the chapter on duality is complemented by a
number of nontrivial applications, such as a characterisation of translation-invariant
subspaces of $L^1(\mathbb{R}^d)$ and Prokhorov’s theorem about weak convergence of probability
measures. The chapter on bounded operators contains a discussion of the Fourier transform
and the Hilbert transform, and includes proofs of the Riesz–Thorin and Marcinkiewicz
interpolation theorems. After the introduction of the Laplace operator as a closable
operator in $L^p$, its closure $\Delta$ is revisited in later chapters from different points of view: as
the operator arising from a suitable sesquilinear form, as the operator $-\nabla^*\nabla$ with its
natural domain, and as the generator of the heat semigroup. In parallel, the theory of its
Gaussian analogue, the Ornstein–Uhlenbeck operator, is developed and the connection
with orthogonal polynomials and the quantum harmonic oscillator is established. The
chapter on semigroup theory, besides developing the general theory, includes a detailed
treatment of some important examples such as the heat semigroup, the Poisson
semigroup, the Schrödinger group, and the wave group. By presenting the material in this
integrated manner, it is hoped that the reader will appreciate Functional Analysis as a 
subject that, besides having its own depth and beauty, is deeply connected with other 
areas of Mathematics and Mathematical Physics. 

In order to contain this already lengthy text within reasonable bounds, some choices 
had to be made. Relatively abstract subjects such as topological vector spaces, Banach 
algebras, and C*-algebras are not covered. Weak topologies are introduced ad hoc, the 
use of distributions in the treatment of weak derivatives is avoided, and the theory of 
Sobolev spaces is developed only to the extent needed for the treatment of boundary 
value problems, form methods, and semigroups. The chapter on states and observables 
in Quantum Mechanics is phrased in the language of Hilbert space operators. 

A work like this makes no claim to originality and most of the results presented here 
belong to the core of the subject. Not just the statements, but often their proofs too, are 
part of the established canon. Most are taken from, or represent minor variations of, 
proofs in the many excellent Functional Analysis textbooks in print. 

Special thanks go to my students, to whom I dedicate this work. Teaching them 
has always been a great source of inspiration. Arjan Cornelissen, Bart van Gisbergen, 
Sigur Gouwens, Tom van Groeningen, Sean Harris, Sasha Ivlev, Rik Ledoux, Yuchen 
Liao, Eva Maquelin, Garazi Muguruza, Christopher Reichling, Floris Roodenburg, Max 
Sauerbrey, Cynthia Slotboom, Joop Vermeulen, Matthijs Vernooij, Anouk Wisse, and 
Timo Wortelboer pointed out many misprints and more serious errors in earlier ver- 
sions of this manuscript. The responsibility for any remaining ones is of course with 
me. A list with errata will be maintained on my personal webpage. I thank Emiel Lorist, 
Lukas Miaskiwskyi, and Ivan Yaroslavtsev for suggesting some interesting problems, 
Jock Annelle and Jay Kangel for typographical comments, and Francesca Arici, Martijn 
Caspers, Tom ter Elst, Markus Haase, Bas Janssens, Kristin Kirchner, Klaas Landsman, 
Ben de Pagter, Pierre Portal, Fedor Sukochev, Walter van Suijlekom, and Mark Veraar 
for helpful discussions and valuable suggestions. 

A significant portion of this book was written in the extraordinary circumstances of 
the global pandemic. The sudden decrease in overhead and the opportunity of work- 
ing from home created the time and serenity needed for this project. Paraphrasing the 
epilogue of W. F. Hermans’s novel Onder Professoren (Among Professors), the book 
was written entirely in the hours otherwise spent on departmental meetings, committee 
meetings, evaluations, accreditations, visitations, midterms, reviews, previews, etcetera, 
and so forth. All that precious time has been spent in a very useful way by the author. 


Delft, March 2022 


Notation and Conventions 


We write $\mathbb{N} = \{0,1,2,\dots\}$ for the set of nonnegative integers, and $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ for
the sets of integer, rational, real, and complex numbers. Whenever a statement is valid
both over the real and complex scalar field we use the symbol $\mathbb{K}$ to denote either $\mathbb{R}$ or $\mathbb{C}$.
Given a complex number $z = a + bi$ with $a, b \in \mathbb{R}$, we denote by $\overline{z} = a - bi$ its complex
conjugate and by $\operatorname{Re} z = a$ and $\operatorname{Im} z = b$ its real and imaginary parts. We use the symbols
$\mathbb{D}$ and $\mathbb{T}$ for the open unit disc and the unit circle in the complex plane, respectively.
The indicator function of a set $A$ is denoted by $\mathbf{1}_A$. In the context of metric and normed
spaces, $B(x;r)$ denotes the open ball with radius $r$ centred at $x$. The interior and closure
of a set $S$ are denoted by $S^\circ$ and $\overline{S}$, respectively. We write $S' \subseteq S$ to express that $S'$ is a
subset of $S$. The complement of a set $S$ is denoted by $\complement S$ when the larger ambient set,
of which $S$ is a subset, is understood. We write $|x|$ for the absolute value of a real
number $x \in \mathbb{R}$, the modulus of a complex number $x \in \mathbb{C}$, and the euclidean norm of an
element $x = (x_1,\dots,x_d) \in \mathbb{K}^d$. When dealing with functions $f$ defined on some domain
$D$, we write $f \equiv c$ on $S \subseteq D$ if $f(x) = c$ for all $x \in S$. The null space and range of a
linear operator $A$ are denoted by $\mathsf{N}(A)$ and $\mathsf{R}(A)$, respectively. When $A$ is unbounded, its
domain is denoted by $\mathsf{D}(A)$. A comprehensive list of symbols is contained in the index.

Unless explicitly otherwise stated, the symbols X and Y denote Banach spaces and H 
and K Hilbert spaces. In order to avoid frequent repetitions in the statements of results, 
these spaces are always thought of as being given and fixed. Conventions with this re- 
gard are usually stated at the beginning of a chapter or, in some cases, at the beginning 
of a section. The same pertains to the choice of scalar field. In Chapters 1–5, the scalar
field $\mathbb{K}$ can be either $\mathbb{R}$ or $\mathbb{C}$, with a small number of exceptions where this is explicitly
stated, such as in our treatment of the Hahn–Banach theorem, the Fourier transform, and
the Hilbert transform. From Chapter 6 onwards, spectral theory and Fourier transforms
are used extensively and the default choice of scalar field is $\mathbb{C}$. In many cases, however,
statements not explicitly involving complex numbers or constructions involving them 
admit counterparts over the real scalars which can be obtained by simple complexifica- 
tion arguments. We leave it to the interested reader to check this in particular instances. 
Banach Spaces


The foundations of modern Analysis were laid in the early decades of the twentieth 
century, through the work of Maurice Fréchet, Ivar Fredholm, David Hilbert, Henri 
Lebesgue, Frigyes Riesz, and many others. These authors realised that it is fruitful to 
study linear operations in a setting of abstract spaces endowed with further structure to 
accommodate the notions of convergence and continuity. This led to the introduction of 
abstract topological and metric spaces and, when combined with linearity, of topological 
vector spaces, Hilbert spaces, and Banach spaces. Since then, these spaces have played 
a prominent role in all branches of Analysis. 

The main impetus came from the study of or- 
dinary and partial differential equations where 
linearity is an essential ingredient, as evidenced 
by the linearity of the main operations involved: 
point evaluations, integrals, and derivatives. It 
was discovered that many theorems known at 
the time, such as existence and uniqueness re- 
sults for ordinary differential equations and the 
Fredholm alternative for integral equations, can 
be conveniently abstracted into general theorems 
about linear operators in infinite-dimensional 
spaces of functions. 

A second source of inspiration was the discov- 
ery, in the 1920s by John von Neumann, that the 
— at that time brand new — theory of Quantum Mechanics can be put on a solid math- 
ematical foundation by means of the spectral theory of selfadjoint operators on Hilbert 
spaces. It was not until the 1930s that these two lines of mathematical thinking were 
brought together in the theory of Banach spaces, named after its creator Stefan Banach 
(although this class of spaces was also discovered, independently and about the same 
time, by Norbert Wiener). This theory provides a unified perspective on Hilbert spaces
and the various spaces of functions encountered in Analysis, including the spaces $C(K)$
of continuous functions and the spaces $L^p(\Omega)$ of Lebesgue integrable functions.

[Portrait: Stefan Banach, 1892–1945]


1.1 Banach Spaces 


The aim of the present chapter is to introduce the class of Banach spaces and dis- 
cuss some elementary properties of these spaces. The main classical examples are only 
briefly mentioned here; a more detailed treatment is deferred to the next two chapters. 
Much of the general theory applies to both the real and complex scalar field. Whenever
this applies, the symbol $\mathbb{K}$ is used to denote the scalar field, which is $\mathbb{R}$ in the case of
real vector spaces and $\mathbb{C}$ in the case of complex vector spaces.


1.1.a Definition and General Properties 


Definition 1.1 (Norms). A normed space is a pair $(X, \|\cdot\|)$, where $X$ is a vector space
over $\mathbb{K}$ and $\|\cdot\| : X \to [0,\infty)$ is a norm, that is, a mapping with the following properties:


(i) $\|x\| = 0$ implies $x = 0$;
(ii) $\|cx\| = |c|\,\|x\|$ for all $c \in \mathbb{K}$ and $x \in X$;
(iii) $\|x + x'\| \le \|x\| + \|x'\|$ for all $x, x' \in X$.


When the norm $\|\cdot\|$ is understood we simply write $X$ instead of $(X, \|\cdot\|)$. If we wish
to emphasise the role of $X$ we write $\|\cdot\|_X$ instead of $\|\cdot\|$.

The properties (ii) and (iii) are referred to as scalar homogeneity and the triangle 
inequality. The triangle inequality implies that every normed space is a metric space, 
with distance function 


\[ d(x,y) = \|x - y\|. \]


This observation allows us to introduce metric notions such as openness, closedness, 
compactness, denseness, limits, convergence, completeness, and continuity in the con- 
text of normed spaces by carrying them over from the theory of metric spaces. For 
instance, a sequence $(x_n)_{n\ge 1}$ in $X$ is said to converge if there exists an element $x \in X$
such that $\lim_{n\to\infty} \|x_n - x\| = 0$. This element, if it exists, is unique and is called the limit
of the sequence $(x_n)_{n\ge 1}$. We then write $\lim_{n\to\infty} x_n = x$ or simply ‘$x_n \to x$ as $n \to \infty$’.

The triangle inequality (iii) implies both $\|x\| - \|x'\| \le \|x - x'\|$ and $\|x'\| - \|x\| \le \|x' - x\|$.
Since $\|x' - x\| = \|(-1)\cdot(x - x')\| = \|x - x'\|$ by scalar homogeneity, we obtain the
reverse triangle inequality


\[ \bigl|\, \|x\| - \|x'\| \,\bigr| \le \|x - x'\|. \]


It shows that taking norms $x \mapsto \|x\|$ is a continuous operation.
If $\lim_{n\to\infty} x_n = x$ and $\lim_{n\to\infty} x_n' = x'$ in $X$ and $c \in \mathbb{K}$ is a scalar, then
$\|cx_n - cx\| = \|c(x_n - x)\| = |c|\,\|x_n - x\|$ implies

\[ \lim_{n\to\infty} \|cx_n - cx\| = 0. \]

Likewise, $\|(x_n + x_n') - (x + x')\| = \|(x_n - x) + (x_n' - x')\| \le \|x_n - x\| + \|x_n' - x'\|$ implies

\[ \lim_{n\to\infty} \|(x_n + x_n') - (x + x')\| = 0. \]

This proves sequential continuity, and hence continuity, of the vector space operations. 
Throughout this work we use the notation 


\[ B(x_0;r) := \{x \in X : \|x - x_0\| < r\} \]
for the open ball centred at $x_0 \in X$ with radius $r > 0$, and
\[ \overline{B}(x_0;r) := \{x \in X : \|x - x_0\| \le r\} \]
for the corresponding closed ball. The open unit ball and closed unit ball are the balls
\[ B_X := B(0;1) = \{x \in X : \|x\| < 1\}, \qquad \overline{B}_X := \overline{B}(0;1) = \{x \in X : \|x\| \le 1\}. \]

Definition 1.2 (Banach spaces). A Banach space is a complete normed space. 


Thus a Banach space is a normed space $X$ in which every Cauchy sequence is convergent,
that is, $\lim_{m,n\to\infty} \|x_n - x_m\| = 0$ implies the existence of an $x \in X$ such that
$\lim_{n\to\infty} \|x_n - x\| = 0$.

The following proposition gives a necessary and sufficient condition for a normed 
space to be a Banach space. We need the following terminology. Given a sequence
$(x_n)_{n\ge 1}$ in a normed space $X$, the sum $\sum_{n\ge 1} x_n$ is said to be convergent, with sum $x \in X$,
if

\[ \lim_{N\to\infty} \Bigl\| x - \sum_{n=1}^{N} x_n \Bigr\| = 0. \]

The sum $\sum_{n\ge 1} x_n$ is said to be absolutely convergent if $\sum_{n\ge 1} \|x_n\| < \infty$.


Proposition 1.3. A normed space X is a Banach space if and only if every absolutely 
convergent sum in X converges in X. 


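Proof ‘Only if’: Suppose that $X$ is complete and let $\sum_{n\ge 1} x_n$ be absolutely convergent.
Then the sequence of partial sums $(\sum_{j=1}^{n} x_j)_{n\ge 1}$ is a Cauchy sequence, for if $n > m$ the
triangle inequality implies

\[ \Bigl\| \sum_{j=1}^{n} x_j - \sum_{j=1}^{m} x_j \Bigr\| = \Bigl\| \sum_{j=m+1}^{n} x_j \Bigr\| \le \sum_{j=m+1}^{n} \|x_j\|, \]

which tends to 0 as $m, n \to \infty$. Hence, by completeness, the sum $\sum_{n\ge 1} x_n$ converges.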
‘If’: Suppose that every absolutely convergent sum in $X$ converges in $X$, and let
$(x_n)_{n\ge 1}$ be a Cauchy sequence in $X$. We must prove that $(x_n)_{n\ge 1}$ converges in $X$.

Choose indices $n_1 < n_2 < \dots$ in such a way that $\|x_i - x_j\| < \frac{1}{2^k}$ for all $i, j \ge n_k$,
$k = 1, 2, \dots$ The sum $x_{n_1} + \sum_{k\ge 1} (x_{n_{k+1}} - x_{n_k})$ is absolutely convergent since

\[ \sum_{k\ge 1} \|x_{n_{k+1}} - x_{n_k}\| \le \sum_{k\ge 1} \frac{1}{2^k} < \infty. \]

By assumption it converges to some $x \in X$. Then, by cancellation,

\[ x = \lim_{m\to\infty} \Bigl( x_{n_1} + \sum_{k=1}^{m} (x_{n_{k+1}} - x_{n_k}) \Bigr) = \lim_{m\to\infty} x_{n_{m+1}}, \]

and therefore the subsequence $(x_{n_m})_{m\ge 1}$ is convergent, with limit $x$. To see that $(x_n)_{n\ge 1}$
converges to $x$, we note that

\[ \|x_m - x\| \le \|x_m - x_{n_m}\| + \|x_{n_m} - x\| \to 0 \]

as $m \to \infty$ (the first term since we started from a Cauchy sequence and the second term
by what we just proved).


The next theorem asserts that every normed space can be completed to a Banach 
space. For the rigorous formulation of this result we need the following terminology. 


Definition 1.4 (Isometries). A linear mapping T from a normed space X into a normed 
space Y is said to be an isometry if it preserves norms. A normed space X is isometrically 
contained in a normed space Y if there exists an isometry from X into Y. 


Theorem 1.5 (Completion). Let X be a normed space. Then: 


(1) there exists a Banach space $\overline{X}$ containing $X$ isometrically as a dense subspace;

(2) the space $\overline{X}$ is unique up to isometry in the following sense: If $X$ is isometrically
contained as a dense subspace in the Banach spaces $\overline{X}$ and $\overline{\overline{X}}$, then the identity
mapping on $X$ has a unique extension to an isometry from $\overline{X}$ onto $\overline{\overline{X}}$.


Proof As a metric space, $X = (X,d)$ has a completion $\overline{X} = (\overline{X}, \overline{d})$ by Theorem D.6. We
prove that $\overline{X}$ is a Banach space in a natural way, with a norm $\|\cdot\|_{\overline{X}}$ such that
$\overline{d}(x,x') = \|x - x'\|_{\overline{X}}$. The properties (1) and (2) then follow from the corresponding assertions for
metric spaces.

Recall that the completion $\overline{X}$ of $X$, as a metric space, is defined as the set of all
equivalence classes of Cauchy sequences in $X$, declaring the Cauchy sequences $(x_n)_{n\ge 1}$
and $(x_n')_{n\ge 1}$ to be equivalent if $\lim_{n\to\infty} d(x_n, x_n') = \lim_{n\to\infty} \|x_n - x_n'\| = 0$. The space $\overline{X}$
is a vector space under the scalar multiplication

\[ c[(x_n)_{n\ge 1}] = [(cx_n)_{n\ge 1}] \]

and addition

\[ [(x_n)_{n\ge 1}] + [(x_n')_{n\ge 1}] = [(x_n + x_n')_{n\ge 1}], \]

where the brackets denote the equivalence class.

If $(x_n)_{n\ge 1}$ is a Cauchy sequence in $X$, the reverse triangle inequality implies that the
nonnegative sequence $(\|x_n\|)_{n\ge 1}$ is Cauchy, and hence convergent by the completeness
of the real numbers. We now define a norm on $\overline{X}$ by

\[ \|[(x_n)_{n\ge 1}]\|_{\overline{X}} = \lim_{n\to\infty} \|x_n\|. \]

Denoting by $\overline{d}$ the metric on $\overline{X}$ given by $\overline{d}(x,x') := \lim_{n\to\infty} d(x_n, x_n')$, where $x = [(x_n)_{n\ge 1}]$
and $x' = [(x_n')_{n\ge 1}]$, it is clear that $\overline{d}(x,x') = \|x - x'\|_{\overline{X}}$.


1.1.b Subspaces, Quotients, and Direct Sums 


Several abstract constructions enable us to create new Banach spaces from given ones. 
We take a brief look at the three most basic constructions, namely, passing to closed 
subspaces and quotients and building direct sums. 


Subspaces A subspace Y of a normed space X is a normed space with respect to the 
norm inherited from X. A subspace Y of a Banach space X is a Banach space with 
respect to the norm inherited from X if and only if Y is closed in X. 

To prove the ‘if’ part, suppose that $(y_n)_{n\ge 1}$ is a Cauchy sequence in the closed subspace
$Y$ of a Banach space $X$. Then it has a limit in $X$, by the completeness of $X$, and
this limit belongs to $Y$, by the closedness of $Y$. The proof of the ‘only if’ part is equally
simple and does not require $X$ to be complete. If $(y_n)_{n\ge 1}$ is a sequence in the complete
subspace $Y$ such that $y_n \to x$ in $X$, then $(y_n)_{n\ge 1}$ is a Cauchy sequence in $X$, hence also
in $Y$, and therefore it has a limit $y$ in $Y$, by the completeness of $Y$. Since $(y_n)_{n\ge 1}$ also
converges to $y$ in $X$, it follows that $y = x$ and therefore $x \in Y$.


Quotients If $Y$ is a closed subspace of a Banach space $X$, the quotient space $X/Y$ can
be endowed with a norm by

\[ \|[x]\| := \inf_{y\in Y} \|x - y\|, \]

where for brevity we write $[x] := x + Y$ for the equivalence class of $x$ modulo $Y$. Let us
check that this indeed defines a norm. If $\|[x]\| = 0$, then there is a sequence $(y_n)_{n\ge 1}$ in
$Y$ such that $\|x - y_n\| < \frac{1}{n}$ for all $n \ge 1$. Then

\[ \|y_n - y_m\| \le \|y_n - x\| + \|x - y_m\| < \frac{1}{n} + \frac{1}{m}, \]

so $(y_n)_{n\ge 1}$ is a Cauchy sequence in $X$. It has a limit $y \in X$ since $X$ is complete, and we
have $y \in Y$ since $Y$ is closed. Then $\|x - y\| = \lim_{n\to\infty} \|x - y_n\| = 0$, so $x = y$. This implies
that $[x] = [y] = [0]$, the zero element of $X/Y$. The identity $\|c[x]\| = |c|\,\|[x]\|$ is trivially
verified, and so is the triangle inequality.

To see that the normed space $X/Y$ is complete we use the completeness of $X$ and
Proposition 1.3. If $\sum_{n\ge 1} \|[x_n]\| < \infty$ and the $y_n \in Y$ are such that $\|x_n - y_n\| \le \|[x_n]\| + 2^{-n}$, the
proposition implies that $\sum_{n\ge 1} (x_n - y_n)$ converges in $X$, say to $x$. Then, for all $N \ge 1$,
since $\sum_{n=1}^{N} y_n \in Y$,

\[ \Bigl\| [x] - \sum_{n=1}^{N} [x_n] \Bigr\| = \Bigl\| \Bigl[ x - \sum_{n=1}^{N} x_n + \sum_{n=1}^{N} y_n \Bigr] \Bigr\| \le \Bigl\| x - \sum_{n=1}^{N} (x_n - y_n) \Bigr\|. \]

As $N \to \infty$, the right-hand side tends to 0 and therefore $\lim_{N\to\infty} \sum_{n=1}^{N} [x_n] = [x]$ in $X/Y$.
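As a concrete illustration of the quotient norm, the following minimal sketch takes $X = \mathbb{R}^2$ with the euclidean norm and $Y$ the line spanned by $(1,1)$; both choices, and the helper name `quotient_norm`, are assumptions made for this example only. In this Hilbert space setting the infimum over $Y$ is attained at the orthogonal projection, so the quotient norm of $[x]$ is simply the distance from $x$ to the line.

```python
import math

def quotient_norm(x, direction):
    """Distance from x in R^2 to the line Y = span{direction}.

    This equals inf over y in Y of |x - y|; for the euclidean norm the
    infimum is attained at the orthogonal projection of x onto Y.
    """
    dot = x[0] * direction[0] + x[1] * direction[1]
    length_sq = direction[0] ** 2 + direction[1] ** 2
    t = dot / length_sq  # coefficient of the orthogonal projection
    residual = (x[0] - t * direction[0], x[1] - t * direction[1])
    return math.hypot(residual[0], residual[1])

direction = (1.0, 1.0)  # spans the closed subspace Y (illustrative choice)
x = (3.0, 1.0)
norm_x = quotient_norm(x, direction)

# Adding an element of Y does not change the class [x], hence not its norm.
norm_shifted = quotient_norm((x[0] + 5.0, x[1] + 5.0), direction)

# Elements of Y itself represent the zero class [0].
norm_of_zero_class = quotient_norm((2.0, 2.0), direction)
```

Here `norm_x` and `norm_shifted` agree because $(5,5) \in Y$, which mirrors the fact that the norm is a function of the class $[x]$ and not of the representative.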


Direct Sums A product norm on a finite cartesian product $X = X_1 \times \dots \times X_N$ of normed
spaces is a norm $\|\cdot\|$ satisfying

\[ \|(0,\dots,0,x_n,0,\dots,0)\| = \|x_n\| \le \|(x_1,\dots,x_N)\| \]

(with $x_n$ in the $n$-th position) for all $x = (x_1,\dots,x_N) \in X$ and $n = 1,\dots,N$. For instance, every norm $|\cdot|$ on $\mathbb{K}^N$ assigning
norm one to the standard unit vectors induces a product norm on $X$ by the formula

\[ \|(x_1,\dots,x_N)\| := |(\|x_1\|,\dots,\|x_N\|)|. \tag{1.1} \]

As a normed space endowed with a product norm, the cartesian product will be denoted

\[ X = X_1 \oplus \dots \oplus X_N \]

and called a direct sum of $X_1,\dots,X_N$. If every $X_n$ is a Banach space, then the normed
space $X$ is a Banach space. Indeed, from

\[ \|x\| = \Bigl\| \sum_{n=1}^{N} (0,\dots,0,x_n,0,\dots,0) \Bigr\| \le \sum_{n=1}^{N} \|x_n\| \le N\|x\| \tag{1.2} \]

we see that a sequence $(x^{(k)})_{k\ge 1}$ in $X$ is Cauchy if and only if all its coordinate sequences
$(x_n^{(k)})_{k\ge 1}$ are Cauchy. If the spaces $X_n$ are complete, these coordinate sequences have
limits $x_n$ in $X_n$, and these limits serve as the coordinates of an element $x = (x_1,\dots,x_N)$
in $X$ which is the limit of the sequence $(x^{(k)})_{k\ge 1}$.
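The two-sided estimate used above can be checked numerically in a small case. The sketch below is only an illustration under assumed choices: $X_1 = \mathbb{R}^2$ with the euclidean norm, $X_2 = \mathbb{R}$ with the absolute value, and product norms built from the $p$-norms on $\mathbb{R}^N$ for $p = 1, 2, \infty$. The helper name `product_norm` is hypothetical.

```python
import math

def product_norm(component_norms, p):
    """Combine the norms of the components with the p-norm on R^N.

    p = math.inf gives the maximum of the component norms.
    """
    if p == math.inf:
        return max(component_norms)
    return sum(c ** p for c in component_norms) ** (1.0 / p)

# x = (x_1, x_2) with x_1 in R^2 (euclidean norm) and x_2 in R (absolute value)
x1, x2 = (3.0, 4.0), -2.0
component_norms = [math.hypot(x1[0], x1[1]), abs(x2)]  # [5.0, 2.0]
N = len(component_norms)

# For each p, record the chain  max_n |x_n|  <=  |x|  <=  sum_n |x_n|  <=  N |x|.
bounds = {}
for p in (1, 2, math.inf):
    total = product_norm(component_norms, p)
    bounds[p] = (max(component_norms), total, sum(component_norms), N * total)
```

Each tuple in `bounds` is nondecreasing from left to right, which is exactly what makes a sequence in the direct sum Cauchy if and only if its coordinate sequences are Cauchy.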


1.1.c First Examples 


The purpose of this brief section is to present a first catalogue of Banach spaces. The 
presentation is not self-contained; the examples will be revisited in more detail in the 
next chapter, where the relevant terminology is introduced and proofs are given. 


Figure 1.1 The open unit balls of $\mathbb{R}^2$ with respect to the norms $\|\cdot\|_1$, $\|\cdot\|_2$, $\|\cdot\|_\infty$.


Example 1.6 (Euclidean spaces). On $\mathbb{K}^d$ we may consider the euclidean norm

\[ \|a\|_2 := \Bigl( \sum_{j=1}^{d} |a_j|^2 \Bigr)^{1/2} \]

and more generally the $p$-norms

\[ \|a\|_p := \Bigl( \sum_{j=1}^{d} |a_j|^p \Bigr)^{1/p}, \qquad 1 \le p < \infty, \]

as well as the supremum norm

\[ \|a\|_\infty := \sup_{1\le j\le d} |a_j|. \]

It is not immediately obvious that the $p$-norms are indeed norms; the triangle inequality
$\|a + b\|_p \le \|a\|_p + \|b\|_p$ will be proved in the next chapter. It is an easy matter to
check that the above norms are all equivalent in the sense defined in the next section.

In what follows the euclidean norm of an element $x \in \mathbb{K}^d$ is denoted by $|x|$ instead of
the more cumbersome $\|x\|_2$.
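The $p$-norms are easy to experiment with. The following sketch evaluates them on sample vectors in $\mathbb{R}^3$ and records the elementary comparison $\|a\|_\infty \le \|a\|_p \le d^{1/p}\|a\|_\infty$, which is one way to see the equivalence of these norms in finite dimensions. The vectors and the function name `p_norm` are arbitrary choices for illustration; a numerical check of this kind is of course no substitute for the proof of the triangle inequality.

```python
import math

def p_norm(a, p):
    """The p-norm of a finite real sequence; p = math.inf gives the supremum norm."""
    if p == math.inf:
        return max(abs(t) for t in a)
    return sum(abs(t) ** p for t in a) ** (1.0 / p)

a = [3.0, -4.0, 12.0]
b = [1.0, 2.0, -2.0]
d = len(a)
exponents = (1, 2, 3, math.inf)

norms_of_a = {p: p_norm(a, p) for p in exponents}

# Triangle inequality |a + b|_p <= |a|_p + |b|_p on this sample pair.
a_plus_b = [s + t for s, t in zip(a, b)]
triangle = {p: (p_norm(a_plus_b, p), p_norm(a, p) + p_norm(b, p)) for p in exponents}

# Comparison with the supremum norm: |a|_inf <= |a|_p <= d^(1/p) |a|_inf.
def upper_factor(p):
    return 1.0 if p == math.inf else d ** (1.0 / p)

comparison = {p: (p_norm(a, math.inf), p_norm(a, p), upper_factor(p) * p_norm(a, math.inf))
              for p in exponents}
```

For this particular `a` the values for $p = 1$, $p = 2$ and $p = \infty$ are 19, 13 and 12, illustrating that $\|a\|_p$ decreases as $p$ increases.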


Example 1.7 (Sequence spaces). Thinking of elements of $\mathbb{K}^d$ as finite sequences, the
preceding example may be generalised to infinite sequences as follows. For $1 \le p < \infty$
the space $\ell^p$ is defined as the space of all scalar sequences $a = (a_k)_{k\ge 1}$ satisfying

\[ \|a\|_p := \Bigl( \sum_{k\ge 1} |a_k|^p \Bigr)^{1/p} < \infty. \]

The mapping $a \mapsto \|a\|_p$ is a norm which turns $\ell^p$ into a Banach space. The space $\ell^\infty$ of
all bounded scalar sequences $a = (a_k)_{k\ge 1}$ is a Banach space with respect to the norm

\[ \|a\|_\infty := \sup_{k\ge 1} |a_k| < \infty. \]

The space $c_0$ consisting of all bounded scalar sequences $a = (a_k)_{k\ge 1}$ satisfying

\[ \lim_{k\to\infty} a_k = 0 \]

is a closed subspace of $\ell^\infty$. As such it is a Banach space in its own right.


Figure 1.2 The open ball $B(f;1)$ in $C[0,1]$ consists of all functions in $C[0,1]$ whose graph
lies inside the shaded area.


Example 1.8 (Spaces of continuous functions). Let $K$ be a compact topological space.
The space $C(K)$ of all continuous functions $f : K \to \mathbb{K}$ is a Banach space with respect
to the supremum norm

\[ \|f\|_\infty := \sup_{x\in K} |f(x)|. \]

This norm captures the notion of uniform convergence: for functions in $C(K)$ we have
$\|f_n - f\|_\infty \to 0$ if and only if $\lim_{n\to\infty} f_n = f$ uniformly.


Example 1.9 (Spaces of integrable functions). Let $(\Omega, \mathscr{F}, \mu)$ be a measure space. For
$1 \le p < \infty$, the space $L^p(\Omega)$ consisting of all measurable functions $f : \Omega \to \mathbb{K}$ such that

\[ \|f\|_p := \Bigl( \int_\Omega |f|^p \, d\mu \Bigr)^{1/p} < \infty, \]

identifying functions that are equal $\mu$-almost everywhere, is a Banach space with respect
to the norm $\|\cdot\|_p$. The space $L^\infty(\Omega)$ consisting of all measurable and $\mu$-essentially
bounded functions $f : \Omega \to \mathbb{K}$, identifying functions that are equal $\mu$-almost everywhere,
is a Banach space with respect to the norm given by the $\mu$-essential supremum

\[ \|f\|_\infty := \mu\text{-}\operatorname*{ess\,sup}_{\omega\in\Omega} |f(\omega)| := \inf\{r > 0 : |f| \le r \ \mu\text{-almost everywhere}\}. \]


Example 1.10 (Spaces of measures). Let $(\Omega, \mathscr{F})$ be a measurable space. The space
$M(\Omega)$ consisting of all $\mathbb{K}$-valued measures of bounded variation on $(\Omega, \mathscr{F})$ is a Banach
space with respect to the variation norm

\[ \|\mu\| = |\mu|(\Omega) = \sup_{\mathscr{A}\in\mathbf{F}} \sum_{A\in\mathscr{A}} |\mu(A)|, \]

where $\mathbf{F}$ denotes the set of all finite collections of disjoint $\mathscr{F}$-measurable subsets of $\Omega$.


Example 1.11 (Hilbert spaces). A Hilbert space is an inner product space $(H, (\cdot|\cdot))$ that
is complete with respect to the norm

\[ \|h\| := (h|h)^{1/2}. \]

Examples include the spaces $\mathbb{K}^d$ with the euclidean norm, $\ell^2$, and the spaces $L^2(\Omega)$.
Further examples will be given in later chapters.


1.1.d Separability 


Most Banach spaces of interest in Analysis are infinite-dimensional in the sense that 
they do not have a finite spanning set. In this context the following definition is often 
useful. 


Definition 1.12 (Separability). A normed space is called separable if it contains a 
countable set whose linear span is dense. 


Proposition 1.13. A normed space X is separable if and only if X contains a countable 
dense set. 


Proof The ‘if’ part is trivial. To prove the ‘only if’ part, let $(x_n)_{n\ge 1}$ have dense span
in $X$. Let $Q$ be a countable dense set in $\mathbb{K}$ (for example, one could take $Q = \mathbb{Q}$ if $\mathbb{K} = \mathbb{R}$
and $Q = \mathbb{Q} + i\mathbb{Q}$ if $\mathbb{K} = \mathbb{C}$). Then the set of all $Q$-linear combinations of the $x_n$, that is,
all linear combinations involving coefficients from $Q$, is countable and dense in $X$.


Finite-dimensional spaces, the sequence spaces $c_0$ and $\ell^p$ with $1 \le p < \infty$, the spaces
$C(K)$ with $K$ compact metric, and $L^p(D)$ with $1 \le p < \infty$ and $D \subseteq \mathbb{R}^d$ open, are separable.
The separability of $C(K)$ and $L^p(D)$ follows from the results proved in the next
chapter.


1.2 Bounded Operators 


Having introduced normed spaces and Banach spaces, we now introduce a class of linear 
operators acting between them which interact with the norm in a meaningful way. 
1.2.a Definition and General Properties


Let X and Y be normed spaces. 


Definition 1.14 (Bounded operators). A linear operator $T : X \to Y$ is bounded if there
exists a finite constant $C \ge 0$ such that


\[ \|Tx\| \le C\|x\|, \qquad x \in X. \]


Here, and in the rest of this work, we write Tx instead of the more cumbersome T (x). 
A bounded operator is a linear operator that is bounded. 


The infimum $C_T$ of all admissible constants $C$ in Definition 1.14 is itself admissible.
Thus $C_T$ is the least admissible constant. We claim that it equals the number

\[ \|T\| := \sup_{\|x\|\le 1} \|Tx\|. \]

To see this, let $C$ be an admissible constant in Definition 1.14, that is, we assume that
$\|Tx\| \le C\|x\|$ for all $x \in X$. Then $\|T\| = \sup_{\|x\|\le 1} \|Tx\| \le C$. This being true for all
admissible constants $C$, it follows that $\|T\| \le C_T$. The opposite inequality $C_T \le \|T\|$
follows by observing that for all $x \in X$ we have

\[ \|Tx\| \le \|T\|\,\|x\|, \]

which means that $\|T\|$ is an admissible constant. This inequality is trivial for $x = 0$, and
for $x \ne 0$ it follows from scalar homogeneity, the linearity of $T$ and the definition of the
number $\|T\|$:

\[ \|Tx\| = \Bigl\| T\frac{x}{\|x\|} \Bigr\| \, \|x\| \le \|T\|\,\|x\|. \]
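In finite dimensions the supremum defining $\|T\|$ can sometimes be evaluated exactly. The sketch below is an illustration under assumed choices: $X = Y = \mathbb{R}^2$ with the supremum norm and $T$ given by a sample matrix. Since $x \mapsto \|Tx\|_\infty$ is convex, its supremum over the closed unit ball (the cube $[-1,1]^n$) is attained at a vertex, so it suffices to test the sign vectors. The function names are hypothetical.

```python
from itertools import product

def sup_norm(v):
    """Supremum norm of a finite real sequence."""
    return max(abs(t) for t in v)

def apply(matrix, v):
    """Matrix-vector product Tx for a matrix given as a list of rows."""
    return [sum(m * t for m, t in zip(row, v)) for row in matrix]

def operator_norm_sup(matrix):
    """sup of |Tx|_inf over the closed unit ball of (R^n, |.|_inf).

    The map x -> |Tx|_inf is convex, so the supremum over the cube
    [-1, 1]^n is attained at one of its vertices, i.e. at a sign vector.
    """
    n = len(matrix[0])
    return max(sup_norm(apply(matrix, signs))
               for signs in product((-1.0, 1.0), repeat=n))

T = [[1.0, -2.0],
     [3.0, 0.5]]
norm_T = operator_norm_sup(T)  # equals the largest absolute row sum of T

# The norm is an admissible constant: |Tx| <= |T| |x| for a sample x.
x = [0.3, -0.7]
lhs = sup_norm(apply(T, x))
rhs = norm_T * sup_norm(x)
```

For this matrix the value is 3.5, the absolute row sum of the second row, and the sample vector satisfies `lhs <= rhs` as the general inequality predicts.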


Proposition 1.15. For a linear operator $T : X \to Y$ the following assertions are equivalent:


(1) $T$ is bounded;
(2) $T$ is continuous;
(3) $T$ is continuous at some point $x_0 \in X$.


Proof The implication (1)$\Rightarrow$(2) follows from

\[ \|Tx - Tx'\| = \|T(x - x')\| \le \|T\|\,\|x - x'\| \]

and the implication (2)$\Rightarrow$(3) is trivial. To prove the implication (3)$\Rightarrow$(1), suppose that
$T$ is continuous at $x_0$. Then there exists a $\delta > 0$ such that $\|x_0 - y\| < \delta$ implies
$\|Tx_0 - Ty\| < 1$. Since every $x \in X$ with $\|x\| < \delta$ is of the form $x = x_0 - y$ with $\|x_0 - y\| < \delta$
(take $y = x_0 - x$) and $T$ is linear, it follows that $\|x\| < \delta$ implies $\|Tx\| < 1$. By scalar
homogeneity and the linearity of $T$ we may scale both sides with a factor $\delta$, and obtain
that $\|x\| < 1$ implies $\|Tx\| < 1/\delta$. From this, and the continuity of $x \mapsto \|x\|$, it follows
that $\|x\| \le 1$ implies $\|Tx\| \le 1/\delta$, that is, $T$ is bounded and $\|T\| \le 1/\delta$.


Easy manipulations involving the properties of norms and linear operators, such as 
those used in the above proofs, will henceforth be omitted. 

The set of all bounded operators from $X$ to $Y$ is a vector space in a natural way with
respect to pointwise scalar multiplication and addition by putting

\[ (cT)x := c(Tx), \qquad (T + T')x := Tx + T'x. \]

This vector space will be denoted by $\mathscr{L}(X,Y)$. We further write $\mathscr{L}(X) := \mathscr{L}(X,X)$.
For all $T, T' \in \mathscr{L}(X,Y)$ and $c \in \mathbb{K}$ we have

\[ \|cT\| = |c|\,\|T\|, \qquad \|T + T'\| \le \|T\| + \|T'\|. \]

Let us prove the second assertion; the proof of the first is similar. For all $x \in X$, the
triangle inequality gives

\[ \|(T + T')x\| \le \|Tx\| + \|T'x\| \le (\|T\| + \|T'\|)\|x\| \]

and the result follows by taking the supremum over all $x \in X$ with $\|x\| \le 1$.

Noting that $\|T\| = 0$ implies $T = 0$, it follows that $T \mapsto \|T\|$ is a norm on $\mathscr{L}(X,Y)$.
Endowed with this norm, $\mathscr{L}(X,Y)$ is a normed space. If $T : X \to Y$ and $S : Y \to Z$ are
bounded, then so is their composition $ST$ and we have

\[ \|ST\| \le \|S\|\,\|T\|. \]

Indeed, for all $x \in X$ we have

\[ \|STx\| \le \|S\|\,\|Tx\| \le \|S\|\,\|T\|\,\|x\| \]

and the result follows by taking the supremum over all $x \in X$ with $\|x\| \le 1$.
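The inequality for compositions can be strict, as a small matrix example shows. The sketch below assumes $X = Y = Z = \mathbb{R}^2$ with the supremum norm, for which the operator norm of a matrix is its largest absolute row sum (a standard fact, used here without proof). The two matrices and the helper names are chosen for illustration only.

```python
def row_sum_norm(matrix):
    """Operator norm of a matrix acting on (R^n, sup norm): largest absolute row sum."""
    return max(sum(abs(m) for m in row) for row in matrix)

def compose(S, T):
    """Matrix of the composition ST (apply T first, then S)."""
    return [[sum(S[i][k] * T[k][j] for k in range(len(T)))
             for j in range(len(T[0]))]
            for i in range(len(S))]

def add(S, T):
    """Matrix of the pointwise sum S + T."""
    return [[s + t for s, t in zip(row_s, row_t)] for row_s, row_t in zip(S, T)]

S = [[0.0, 1.0],
     [0.0, 0.0]]
T = [[1.0, 0.0],
     [0.0, 0.0]]

norm_S = row_sum_norm(S)                # 1.0
norm_T = row_sum_norm(T)                # 1.0
norm_ST = row_sum_norm(compose(S, T))   # 0.0, since ST is the zero matrix
norm_sum = row_sum_norm(add(S, T))      # 2.0, equality in the triangle inequality
```

Here $ST = 0$ although neither factor vanishes, so $\|ST\| = 0 < 1 = \|S\|\,\|T\|$, while the triangle inequality for $S + T$ holds with equality.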
Proposition 1.16. If $Y$ is complete, then $\mathscr{L}(X,Y)$ is complete.


Proof Let $(T_n)_{n\ge 1}$ be a Cauchy sequence in $\mathscr{L}(X,Y)$. From $\|T_n x - T_m x\| \le \|T_n - T_m\|\,\|x\|$
we see that $(T_n x)_{n\ge 1}$ is a Cauchy sequence in $Y$ for every $x \in X$. Let $Tx$ denote
its limit. The linearity of each of the operators $T_n$ implies that the mapping $T : x \mapsto Tx$
is linear and we have $\|Tx\| = \lim_{n\to\infty} \|T_n x\| \le M\|x\|$, where $M := \sup_{n\ge 1} \|T_n\|$ is
finite since Cauchy sequences in normed spaces are bounded. This shows that the linear
operator $T$ is bounded, so it is an element of $\mathscr{L}(X,Y)$. To prove that $\lim_{n\to\infty} \|T_n - T\| = 0$,
fix $\varepsilon > 0$ and let $N \ge 1$ be so large that $\|T_n - T_m\| < \varepsilon$ for all $m, n \ge N$. Then, for
$m, n \ge N$, from

\[ \|T_n x - T_m x\| \le \varepsilon\|x\| \]

it follows, upon letting $m \to \infty$, that

\[ \|T_n x - Tx\| \le \varepsilon\|x\|. \]

This being true for all $x \in X$ and $n \ge N$, it follows that $\|T_n - T\| \le \varepsilon$ for all $n \ge N$.


The important special case Y = K leads to the following definition. 
Definition 1.17. The dual space of a normed space $X$ is the Banach space
\[ X^* := \mathscr{L}(X,\mathbb{K}). \]


The elements of the dual space X* are often referred to as bounded functionals or 
simply functionals. Duality is a subject in its own right which will be taken up in Chapter 
4. In that chapter, explicit representations of duals of several classical Banach spaces are 
given. For Hilbert spaces this duality takes a particularly simple form, described by the 
Riesz representation theorem, to be proved in Chapter 3. 

It often happens that a linear operator can be shown to be well defined and bounded on 
a dense subspace. In such cases, a density argument can be used to extend the operator 
to the whole space. 


Proposition 1.18 (Density argument – extending operators). Let $X$ be a normed space
and $Y$ be a Banach space, and let $X_0$ be a dense subspace of $X$. If $T_0 : X_0 \to Y$ is a
bounded operator, there exists a unique bounded operator $T : X \to Y$ extending $T_0$. The
norm of this extension satisfies $\|T\| = \|T_0\|$.


Proof Fix $x \in X$, and suppose that $\lim_{n\to\infty} x_n = x$ with $x_n \in X_0$ for all $n \ge 1$. The
boundedness of $T_0$ implies that $\|T_0 x_n - T_0 x_m\| \le \|T_0\|\,\|x_n - x_m\| \to 0$ as $m, n \to \infty$, so
$(T_0 x_n)_{n\ge 1}$ is a Cauchy sequence in $Y$. Since $Y$ is complete, we have $T_0 x_n \to y$ for some
$y \in Y$.

If also $x_n' \to x$, the same argument shows that $T_0 x_n' \to y'$ for some (possibly different)
$y' \in Y$. From

\[ \|T_0 x_n' - T_0 x_n\| \le \|T_0\|\,\|x_n' - x_n\| \le \|T_0\| \bigl( \|x_n' - x\| + \|x - x_n\| \bigr) \]

it follows that

\[ \|y' - y\| = \lim_{n\to\infty} \|T_0 x_n' - T_0 x_n\| = 0 \]

and therefore $y' = y$.

Denoting the common limit $y = y'$ by $Tx$, we thus obtain a well defined mapping
$x \mapsto Tx$. It is evident that this mapping extends $T_0$, for if $x \in X_0$ we may take $x_n = x$ and
then $Tx = \lim_{n\to\infty} T_0 x_n = T_0 x$.

It is easily checked that $T$ is linear. To show that it is bounded, with $\|T\| \le \|T_0\|$, we
just note that

\[ \|Tx\| = \lim_{n\to\infty} \|T_0 x_n\| \le \|T_0\| \lim_{n\to\infty} \|x_n\| = \|T_0\|\,\|x\|. \]

The converse inequality $\|T\| \ge \|T_0\|$ trivially holds since $T$ extends $T_0$.
Finally, if the bounded operators $T$ and $T'$ both extend $T_0$, then the bounded operator
$T - T'$ equals 0 on the dense subspace $X_0$ and hence, by continuity, on all of $X$.


Under a uniform boundedness assumption, a similar density argument can be used to 
extend the existence of limits from a dense subspace to the whole space. 


Proposition 1.19 (Density argument — extending convergence of operators). Let X be 
a normed space and Y a Banach space, and let Xo be a dense subspace of X. Let 
(Tn)n>1 be a sequence of operators in .2(X,Y) satisfying sup,>, ||Tn|| <0. If the limit 
limy—oo T,xX0 exists in Y for all xg € Xo, then the limit Tx := limy-,.0T,x exists in Y for 
all x € X. Moreover, the operator T :x++ Tx is linear and bounded from X to Y, and 


|7|| < limin€ ZI). 
n-oo 


Proof We will show that the sequence $(T_n x)_{n\ge 1}$ is Cauchy for every $x \in X$. Fix arbitrary
$x \in X$ and $\varepsilon > 0$ and choose $x_0 \in X_0$ in such a way that $\|x - x_0\| < \varepsilon/M$, where $M :=
\sup_{n\ge 1} \|T_n\|$. Since $(T_n x_0)_{n\ge 1}$ is a Cauchy sequence, there is an $N \ge 1$ such that $\|T_n x_0 -
T_m x_0\| < \varepsilon$ for all $m,n \ge N$. Then, for all $m,n \ge N$,

\[
\begin{aligned}
\|T_n x - T_m x\| &\le \|T_n x - T_n x_0\| + \|T_n x_0 - T_m x_0\| + \|T_m x_0 - T_m x\| \\
&\le M\|x - x_0\| + \varepsilon + M\|x_0 - x\| < 3\varepsilon.
\end{aligned}
\]


The sequence $(T_n x)_{n\ge 1}$ is thus Cauchy. Since $Y$ is complete this sequence has a limit,
which we denote by $Tx$. Linearity of $T : x \mapsto Tx$ is clear, and boundedness along with
the estimate for the norm follow from

\[ \|Tx\| = \lim_{n\to\infty} \|T_n x\| = \liminf_{n\to\infty} \|T_n x\| \le \liminf_{n\to\infty} \|T_n\| \|x\|. \]


This proposition should be compared with Proposition 5.3, which provides the following
partial converse: if $X$ is a Banach space, $Y$ is a normed space, and $(T_n)_{n\ge 1}$
is a sequence in $\mathscr{L}(X,Y)$ such that $Tx := \lim_{n\to\infty} T_n x$ exists in $Y$ for all $x \in X$, then
$\sup_{n\ge 1} \|T_n\| < \infty$.


Definition 1.20 (Null space and range). The null space of a bounded operator $T \in
\mathscr{L}(X,Y)$ is the subspace

\[ N(T) := \{x \in X : Tx = 0\}. \]

The range of $T$ is the subspace

\[ R(T) := \{Tx : x \in X\}. \]

By linearity, both the null space $N(T)$ and the range $R(T)$ are subspaces. By continuity,
the null space of a bounded operator is closed. The following result gives a useful
sufficient criterion for the range of a bounded operator to be closed.


Proposition 1.21. Let $X$ be a Banach space and $Y$ be a normed space. If $T \in \mathscr{L}(X,Y)$
satisfies $\|Tx\| \ge C\|x\|$ for some $C > 0$ and all $x \in X$, then $T$ is injective and has closed
range.


Proof Injectivity is clear. Suppose that $Tx_n \to y$ in $Y$; we must prove that $y \in R(T)$.
From $\|x_n - x_m\| \le C^{-1} \|Tx_n - Tx_m\|$ it follows that $(x_n)_{n\ge 1}$ is a Cauchy sequence in $X$
and therefore converges to some $x \in X$. Then $y = \lim_{n\to\infty} Tx_n = Tx$.


We conclude by introducing some terminology that will be used throughout this work. 
In the next four definitions, X and Y are normed spaces. 


Definition 1.22 (Isomorphisms). An isomorphism is a bijective operator $T \in \mathscr{L}(X,Y)$
whose inverse is bounded as well. An isometric isomorphism is an isomorphism that is
also isometric. The spaces $X$ and $Y$ are called (isometrically) isomorphic if there exists
an (isometric) isomorphism from $X$ to $Y$.


Definition 1.23 (Contractions). A contraction is an operator $T \in \mathscr{L}(X,Y)$ satisfying
$\|T\| \le 1$.


Definition 1.24 (Uniform boundedness). A subset $\mathscr{T}$ of $\mathscr{L}(X,Y)$ is said to be uniformly
bounded if it is a bounded subset of $\mathscr{L}(X,Y)$, i.e., if $\sup_{T \in \mathscr{T}} \|T\| < \infty$.


Definition 1.25 (Uniform, strong, and weak convergence of operators). A sequence
$(T_n)_{n\ge 1}$ in $\mathscr{L}(X,Y)$ is said to:


(1) converge uniformly to an operator $T \in \mathscr{L}(X,Y)$ if
\[ \lim_{n\to\infty} \|T_n - T\| = 0; \]
(2) converge strongly to an operator $T \in \mathscr{L}(X,Y)$ if
\[ \lim_{n\to\infty} \|T_n x - Tx\| = 0, \quad x \in X; \]
(3) converge weakly to an operator $T \in \mathscr{L}(X,Y)$ if
\[ \lim_{n\to\infty} \langle T_n x - Tx, y^* \rangle = 0, \quad x \in X,\ y^* \in Y^*, \]


where $Y^*$ is the dual of $Y$. In these situations we call $T$ the uniform limit, respectively
the strong limit, respectively the weak limit, of the sequence $(T_n)_{n\ge 1}$.


Uniform convergence implies strong convergence and strong convergence implies
weak convergence, but the converses generally fail. For instance, the projections onto
the first $n$ coordinates in $\ell^p$, $1 \le p < \infty$, converge strongly to the identity operator, but not
uniformly; and the operators $T^n$, where $T$ is the right shift in $\ell^p$, $1 < p < \infty$, converge
weakly to the zero operator but not strongly (for the case $p = 1$ see Problem 4.33).
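
The gap between strong and uniform convergence can be made tangible numerically. The following is only a sketch under simplifying assumptions: sequences in $\ell^2$ are truncated to finitely many coordinates, and the helper names (`project`, `tail_error`, `witness_error`) are hypothetical and introduced purely for illustration. For the fixed vector $x = (1/k)_{k\ge 1}$ the errors $\|P_n x - x\|$ shrink as $n$ grows, whereas the unit vector $e_{n+1}$ always gives $\|P_n e_{n+1} - e_{n+1}\| = 1$, so $\|P_n - I\| \ge 1$ for every $n$.

```python
import math

N = 10_000  # truncation length; an assumption made only for this illustration


def project(x, n):
    """Keep the first n coordinates of the finite sequence x, zero out the rest."""
    return [xi if i < n else 0.0 for i, xi in enumerate(x)]


def l2_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))


# A fixed square-summable sequence x_k = 1/k, truncated to N terms.
x = [1.0 / k for k in range(1, N + 1)]


def tail_error(n):
    """||P_n x - x||_2 for the fixed vector x: tends to 0 (strong convergence)."""
    return l2_norm([a - b for a, b in zip(project(x, n), x)])


def witness_error(n):
    """||P_n e - e||_2 for e = e_{n+1}: equals 1, so ||P_n - I|| >= 1."""
    e = [0.0] * N
    e[n] = 1.0  # 0-based index n is the (n+1)-st coordinate
    return l2_norm([a - b for a, b in zip(project(e, n), e)])


for n in (10, 100, 1000):
    print(n, tail_error(n), witness_error(n))
```

The first printed column of errors decreases with $n$, while the second stays equal to one; this is consistent with, but of course does not prove, the statement above.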


1.2.b Subspaces, Quotients, and Direct Sums 


Restrictions If $T$ is a bounded operator from a normed space $X$ into a normed space $Y$,
then the restriction of $T$ to a subspace $X_0$ of $X$ defines a bounded operator $T|_{X_0}$ from $X_0$
into $Y$ of norm $\|T|_{X_0}\| \le \|T\|$.


Quotients Let $Y$ be a closed subspace of a Banach space $X$. By the definition of the
quotient norm, the quotient map $q : x \mapsto x + Y$ is bounded from $X$ to $X/Y$ of norm
$\|q\| \le 1$.

Let $Z$ be a normed space and let $T \in \mathscr{L}(X,Z)$ be a bounded operator with the property
that $Y$ is contained in the null space $N(T)$. We claim that

\[ T_{/Y}(x + Y) := Tx, \quad x \in X, \]

defines a well-defined and bounded quotient operator $T_{/Y} : X/Y \to Z$ of norm $\|T_{/Y}\| =
\|T\|$. Well-definedness of $T_{/Y}$ is clear, and for all $x \in X$ and $y \in Y$ we have $\|Tx\| =
\|T(x+y)\| \le \|T\| \|x+y\|$. Taking the infimum over all $y \in Y$ gives the bound

\[ \|T_{/Y}(x+Y)\| = \|Tx\| \le \|T\| \inf_{y\in Y} \|x+y\| = \|T\| \|x+Y\|. \]

Hence $T_{/Y}$ is bounded and $\|T_{/Y}\| \le \|T\|$. For the converse inequality we note that

\[ \|Tx\| = \|T_{/Y}(x+Y)\| \le \|T_{/Y}\| \|x+Y\| = \|T_{/Y}\| \inf_{y\in Y} \|x-y\| \le \|T_{/Y}\| \|x\|. \]

Direct Sums If $X_n$ is a normed space and $T_n \in \mathscr{L}(X_n)$ for $n = 1,\dots,N$, then the direct
sum operator

\[ T = \bigoplus_{n=1}^N T_n : (x_1,\dots,x_N) \mapsto (T_1 x_1,\dots,T_N x_N) \]

is bounded on $X = \bigoplus_{n=1}^N X_n$ with respect to any product norm; this follows from (1.2).
If the product norm is of the form (1.1), then $\|T\| = \max_{1\le n\le N} \|T_n\|$.


1.2.c First Examples 


We revisit the examples of Section 1.1.c and discuss how various natural operations
used in Analysis give rise to bounded operators.

Example 1.26 (Matrices). Every $m \times n$ matrix $A = (a_{ij})_{i,j=1}^{m,n}$ defines a bounded operator
in $\mathscr{L}(\mathbb{K}^n, \mathbb{K}^m)$ and its norm satisfies

\[ \|A\|^2 = \sup_{\|x\|\le 1} \|Ax\|^2 = \sup_{\|x\|\le 1} \sum_{i=1}^m \Big| \sum_{j=1}^n a_{ij} x_j \Big|^2 \le \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2, \tag{1.3} \]

where the last step follows from the Cauchy–Schwarz inequality. More generally, every
linear operator from a finite-dimensional normed space $X$ into a normed space $Y$ is
bounded; this will be shown in Corollary 1.37.


The upper bound (1.3) for the norm of a matrix A is not sharp. An explicit method to 
determine the operator norm of a matrix is described in Problem 4.14. 
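
To see concretely that the bound (1.3) need not be attained, one may compare it with a numerical estimate of the operator norm. The sketch below is an illustration only: it assumes real matrices given as lists of rows, and it estimates $\|A\|$ by power iteration on $A^{\mathsf T} A$, which is a standard numerical technique and not necessarily the method of Problem 4.14. The function names are hypothetical.

```python
import math


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def euclid(x):
    return math.sqrt(sum(v * v for v in x))


def operator_norm(A, iterations=500):
    """Estimate ||A|| (the largest singular value) by power iteration on A^T A."""
    At = transpose(A)
    x = [1.0] * len(A[0])  # starting vector; assumed not orthogonal to the top singular vector
    for _ in range(iterations):
        y = matvec(At, matvec(A, x))
        norm_y = euclid(y)
        if norm_y == 0.0:
            return 0.0
        x = [v / norm_y for v in y]
    return euclid(matvec(A, x))  # x has Euclidean norm one


def frobenius_bound(A):
    """Square root of the right-hand side of (1.3)."""
    return math.sqrt(sum(a * a for row in A for a in row))


identity = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0], [3.0, 4.0]]
for M in (identity, A):
    print(operator_norm(M), frobenius_bound(M))
```

For the $2 \times 2$ identity matrix the operator norm is $1$ while the bound from (1.3) is $\sqrt{2}$, which already shows that the inequality can be strict.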


Example 1.27 (Point evaluations). Let $K$ be a compact topological space. For each
$x_0 \in K$ the point evaluation $E_{x_0} : f \mapsto f(x_0)$ is bounded as an operator from $C(K)$ into
$\mathbb{K}$ with norm $\|E_{x_0}\| = 1$. Boundedness with norm $\|E_{x_0}\| \le 1$ follows from

\[ |E_{x_0} f| = |f(x_0)| \le \sup_{x\in K} |f(x)| = \|f\|_\infty. \]

By considering $f = \mathbf{1}$, the constant-one function on $K$, it is seen that $\|E_{x_0}\| = 1$.


Example 1.28 (Integration). Let $(\Omega, \mathscr{F}, \mu)$ be a measure space. The mapping $I_\mu : f \mapsto
\int_\Omega f \, d\mu$ is bounded from $L^1(\Omega)$ to $\mathbb{K}$ with norm $\|I_\mu\| = 1$. Boundedness with norm
$\|I_\mu\| \le 1$ follows from

\[ |I_\mu f| = \Big| \int_\Omega f \, d\mu \Big| \le \int_\Omega |f| \, d\mu = \|f\|_1. \]

By considering nonnegative functions it is seen that $\|I_\mu\| = 1$.


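Example 1.29 (Pointwise multipliers). Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and fix $1 \le
p \le \infty$. For any $m \in L^\infty(\Omega)$, the pointwise multiplier $T_m : f \mapsto mf$ defines a bounded
operator on $L^p(\Omega)$ with norm $\|T_m\| = \|m\|_\infty$. Indeed, for $\mu$-almost all $\omega \in \Omega$ we have

\[ |(mf)(\omega)| = |m(\omega)| |f(\omega)| \le \|m\|_\infty |f(\omega)|. \]

For $1 \le p < \infty$, upon integration we obtain

\[ \|T_m f\|_p^p = \int_\Omega |mf|^p \, d\mu \le \|m\|_\infty^p \int_\Omega |f|^p \, d\mu = \|m\|_\infty^p \|f\|_p^p. \]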

Hence $T_m$ is bounded on $L^p(\Omega)$ and $\|T_m\| \le \|m\|_\infty$. For $p = \infty$ the analogous bound follows by
taking essential suprema. Equality $\|T_m\| = \|m\|_\infty$ is obtained by considering, for $0 < \varepsilon < 1$,
functions supported on measurable sets $F_\varepsilon \in \mathscr{F}$ where $|m| \ge (1-\varepsilon)\|m\|_\infty$ $\mu$-almost
everywhere.

Example 1.30 (Integral operators). Let $\mu$ be a finite Borel measure on a compact metric
space $K$. With respect to the product metric $d((s,t),(s',t')) := d(s,s') + d(t,t')$, $K \times K$
is a compact metric space (see Proposition D.13). Let $k \in C(K \times K)$ and define, for
$f \in C(K)$, the function $Tf : K \to \mathbb{K}$ by

\[ Tf(s) := \int_K k(s,t) f(t) \, d\mu(t), \quad s \in K. \]

Using the uniform continuity of $k$ (see Theorem D.12), it is easy to see that $Tf$ is a
continuous function. Indeed, given $\varepsilon > 0$, choose $\delta > 0$ so small that $d((s,t),(s',t')) < \delta$
implies $|k(s,t) - k(s',t')| < \varepsilon$. Then $d(s,s') < \delta$ implies

\[ |Tf(s) - Tf(s')| \le \varepsilon \int_K |f(t)| \, d\mu(t) \le \varepsilon \mu(K) \|f\|_\infty. \]

As a result, $T$ acts as a linear operator on $C(K)$. To prove boundedness, we estimate

\[ |Tf(s)| \le \int_K |k(s,t)| |f(t)| \, d\mu(t) \le \mu(K) \|k\|_\infty \|f\|_\infty. \]

Taking the supremum over $s \in K$, this results in

\[ \|Tf\|_\infty \le \mu(K) \|k\|_\infty \|f\|_\infty. \]

It follows that $T$ is bounded and $\|T\| \le \mu(K) \|k\|_\infty$.

For kernels $k \in L^\infty(K \times K, \mu \times \mu)$ the same prescription defines a bounded operator
on $L^\infty(K,\mu)$ satisfying the same estimate. If one takes $k \in L^2(K \times K, \mu \times \mu)$, this
prescription gives a bounded operator $T$ on $L^2(K,\mu)$ satisfying

\[ \|Tf\|_2 \le \|k\|_2 \|f\|_2. \tag{1.4} \]

Indeed, by the Cauchy–Schwarz inequality (its abstract version for Hilbert spaces will
be proved in Chapter 3) and Fubini's theorem we obtain

\[
\begin{aligned}
\|Tf\|_2^2 &= \int_K \Big| \int_K k(s,t) f(t) \, d\mu(t) \Big|^2 d\mu(s) \\
&\le \int_K \Big( \int_K |k(s,t)|^2 \, d\mu(t) \Big) \Big( \int_K |f(t)|^2 \, d\mu(t) \Big) d\mu(s) = \|k\|_2^2 \|f\|_2^2
\end{aligned}
\]

and the claim follows. This inequality generalises the one of Example 1.26.


Example 1.31 (Volterra operator). For all $f \in L^2(0,1)$, the Cauchy–Schwarz inequality
implies that the indefinite integral

\[ Tf(s) := \int_0^s f(t) \, dt, \quad s \in [0,1], \]

is well defined and that $|Tf(s) - Tf(s')| \le |s - s'|^{1/2} \|f\|_2$ for all $s,s' \in [0,1]$. From this
we infer that $Tf \in C[0,1]$ and, by taking $s' = 0$, that $\|Tf\|_\infty \le \|f\|_2$. This implies that
$T$ is bounded from $L^2(0,1)$ into $C[0,1]$ with norm $\|T\| \le 1$.

For functions $g \in C[0,1]$, the pointwise estimate $|g(t)| \le \|g\|_\infty$ implies

\[ \int_0^1 |g(t)|^2 \, dt \le \|g\|_\infty^2, \]

so $g \in L^2(0,1)$ and $\|g\|_2 \le \|g\|_\infty$. This shows that $C[0,1]$ defines a subspace of $L^2(0,1)$,
with an inclusion mapping that is bounded of norm at most 1. Composing $T$ with this inclusion
mapping, the indefinite integral can be viewed as a bounded operator on $L^2(0,1)$
of norm at most 1. Interestingly, this norm bound is not sharp; it can be shown that the
norm of this operator equals $2/\pi$. This will be proved using the spectral theory of
selfadjoint operators in Chapter 8.
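
The value $2/\pi \approx 0.6366$ can be made plausible numerically before the proof is available. The following sketch rests on assumptions that are not part of the text: the interval is divided into $n$ equal cells, the operator is replaced by the lower triangular matrix with entries $h = 1/n$ below the diagonal and $h/2$ on it, and the largest singular value of this matrix is estimated by power iteration. All function names are hypothetical, and the computation is an illustration, not a proof.

```python
import math


def volterra_apply(x, h):
    """Discretised indefinite integral: (Vx)_i = h * (x_0 + ... + x_{i-1} + x_i / 2)."""
    out, running = [], 0.0
    for v in x:
        out.append(h * (running + 0.5 * v))
        running += v
    return out


def volterra_adjoint_apply(y, h):
    """Transpose of the matrix used in volterra_apply (a reversed running sum)."""
    out, running = [0.0] * len(y), 0.0
    for i in range(len(y) - 1, -1, -1):
        out[i] = h * (running + 0.5 * y[i])
        running += y[i]
    return out


def volterra_norm_estimate(n=400, iterations=60):
    """Power iteration on V^T V; returns an estimate of the largest singular value of V."""
    h = 1.0 / n
    x = [1.0] * n
    for _ in range(iterations):
        y = volterra_adjoint_apply(volterra_apply(x, h), h)
        scale = math.sqrt(sum(v * v for v in y))
        x = [v / scale for v in y]
    z = volterra_apply(x, h)
    return math.sqrt(sum(v * v for v in z))  # x has Euclidean norm one


print(volterra_norm_estimate(), 2.0 / math.pi)
```

The weights $\sqrt{h}$ that turn Euclidean norms into discrete $L^2$ norms cancel in the quotient $\|Vx\|/\|x\|$, which is why the plain matrix norm is the quantity to compare with $2/\pi$.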


As this brief list of examples already shows, operators occurring naturally in Analysis 
have a tendency to be bounded. This raises the natural question whether linear operators 
acting between Banach spaces X and Y are always bounded. If one is willing to accept 
the Axiom of Choice the answer is negative, even for separable Hilbert spaces X and Y = 
K (see Problem 3.23). In Zermelo—Fraenkel Set Theory without the Axiom of Choice, 
it is consistent that every linear operator acting between Banach spaces is bounded. The 
reader is referred to the Notes to Chapter 3 for a further discussion of this topic. 


1.3 Finite-Dimensional Spaces 


The aim of this section is to prove that every finite-dimensional normed space is a 
Banach space. This will be deduced as an easy consequence of the fact that every two 
norms on a finite-dimensional normed space are equivalent, in the sense made precise 
in the next definition. 


Definition 1.32 (Equivalent norms). Two norms $\|\cdot\|$ and $|||\cdot|||$ on a vector space $X$ are
equivalent if there exist constants $0 < c \le C < \infty$ such that for all $x \in X$ we have

\[ c\|x\| \le |||x||| \le C\|x\|. \]


Example 1.33. Any two product norms on the product $X = X_1 \times \cdots \times X_N$ of normed
spaces are equivalent. Indeed, (1.2) shows that every product norm on $X$ is equivalent
to the product norm $\|x\|_1 := \sum_{n=1}^N \|x_n\|$ on $X$.


In the above situation we have the inclusions of open balls

\[ B_{\|\cdot\|}(x; r/C) \subseteq B_{|||\cdot|||}(x; r) \subseteq B_{\|\cdot\|}(x; r/c). \]

Hence if two norms on a given vector space are equivalent the resulting normed spaces
have the same open sets. This implies that topological notions such as openness, closedness,
compactness, convergence, and so forth, are preserved under passing to an equivalent
norm.

Theorem 1.34 (Equivalence of norms in finite dimensions). Every two norms on a
finite-dimensional vector space are equivalent.


Proof Let $(X, \|\cdot\|)$ be a finite-dimensional normed space, say of dimension $d$, and let
$(x_j)_{j=1}^d$ be a basis for $X$. Relative to this basis, every $x \in X$ admits a unique representation
$x = \sum_{j=1}^d c_j x_j$. We may use this to define a norm $\|\cdot\|_2$ on $X$ by

\[ \Big\| \sum_{j=1}^d c_j x_j \Big\|_2 := \Big( \sum_{j=1}^d |c_j|^2 \Big)^{1/2}. \]

The theorem follows once we have shown that the norms $\|\cdot\|$ and $\|\cdot\|_2$ are equivalent.
Let $M = \max_{1\le j\le d} \|x_j\|$. By the triangle inequality and the Cauchy–Schwarz inequality,
for all $x = \sum_{j=1}^d c_j x_j$ we have

\[ \|x\| \le \sum_{j=1}^d |c_j| \|x_j\| \le M \sum_{j=1}^d |c_j| \le M d^{1/2} \Big( \sum_{j=1}^d |c_j|^2 \Big)^{1/2} = M d^{1/2} \|x\|_2. \tag{1.5} \]

This gives one of the two inequalities in the definition of equivalence of norms.

To prove that a similar inequality holds in the opposite direction, let $S_2$ denote the
unit sphere in $(X, \|\cdot\|_2)$. Since $(c_1,\dots,c_d) \mapsto \sum_{j=1}^d c_j x_j$ maps the unit sphere of $\mathbb{K}^d$
isometrically (hence continuously) onto $S_2$, $S_2$ is compact. Consider the identity mapping
$I : x \mapsto x$, viewed as a mapping from $S_2$ to $(X, \|\cdot\|)$. The inequality (1.5) implies that $I$
is bounded and therefore continuous. Since taking norms is continuous as well and $S_2$
is compact, the mapping $x \mapsto \|Ix\|$ is continuous from $S_2$ to $[0,\infty)$ and takes a minimum
at some point $x_0 \in S_2$.

Denoting this minimum by $m$, we claim that $m > 0$. It is clear that $m \ge 0$. Reasoning
by contradiction, if we had $m = \|Ix_0\| = 0$, then $Ix_0 = 0$ in $X$, hence $x_0 = 0$ as an
element of $S_2$. Then $\|x_0\|_2 = 0$, while at the same time $\|x_0\|_2 = 1$ because $x_0 \in S_2$. This
contradiction proves the claim.

For any nonzero $x \in X$ we have $x/\|x\|_2 \in S_2$ and therefore $\|I(x/\|x\|_2)\| \ge m$. This gives the
estimate

\[ m\|x\|_2 \le \|Ix\| = \|x\| \]

for nonzero $x \in X$; for trivial reasons it also holds for $x = 0$.
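
For the familiar norms on $\mathbb{R}^d$ the constants in Definition 1.32 are explicit, for instance $\|x\|_2 \le \|x\|_1 \le d^{1/2}\|x\|_2$. The sketch below samples random vectors and records the smallest and largest observed ratio $\|x\|_1/\|x\|_2$. It is an empirical illustration only (sampling proves nothing), and the function names and the choice of sampling from $[-1,1]^d$ are assumptions made for this example.

```python
import math
import random


def norm_1(x):
    return sum(abs(v) for v in x)


def norm_2(x):
    return math.sqrt(sum(v * v for v in x))


def ratio_range(norm_a, norm_b, d, samples=2000, seed=0):
    """Smallest and largest observed value of norm_a(x) / norm_b(x) over random x in [-1, 1]^d."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(samples):
        x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        if norm_b(x) > 0.0:
            ratios.append(norm_a(x) / norm_b(x))
    return min(ratios), max(ratios)


d = 5
low, high = ratio_range(norm_1, norm_2, d)
print(low, high, math.sqrt(d))
```

The observed ratios stay inside the interval $[1, d^{1/2}]$, in line with the two-sided estimate that defines equivalence of norms.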


Corollary 1.35. Every $d$-dimensional normed space is isomorphic to $\mathbb{K}^d$. In particular,
every finite-dimensional normed space is a Banach space.


Proof The first assertion has been proved in the course of the proof of Theorem 1.34,
and the second assertion follows from it since $\mathbb{K}^d$ is complete.


Corollary 1.36. Every finite-dimensional subspace of a normed space is closed.

Proof By Corollary 1.35, every finite-dimensional subspace of a normed space is
complete, and it has been shown in the first paragraph of Section 1.1.b that every complete
subspace of a normed space is closed.


Corollary 1.37. Every linear operator from a finite-dimensional normed space $X$ into
a normed space $Y$ is bounded.


Proof Let $(x_j)_{j=1}^d$ be a basis for $X$. If $T : X \to Y$ is linear, for $x = \sum_{j=1}^d c_j x_j$ we obtain,
by the Cauchy–Schwarz inequality,

\[ \|Tx\| = \Big\| \sum_{j=1}^d c_j T x_j \Big\| \le \sum_{j=1}^d |c_j| \|T x_j\| \le M d^{1/2} \|x\|_2, \]

where $\|x\|_2 := (\sum_{j=1}^d |c_j|^2)^{1/2}$ as in Theorem 1.34 and $M := \max_{1\le n\le d} \|T x_n\|$. By
Theorem 1.34 there exists a constant $K > 0$ such that $\|x\|_2 \le K\|x\|$ for all $x \in X$. Combining
this with the preceding estimate we obtain

\[ \|Tx\| \le M d^{1/2} \|x\|_2 \le K M d^{1/2} \|x\|. \]

This means that $T$ is bounded with norm at most $K M d^{1/2}$.


Every bounded subset of a finite-dimensional normed space $X$ is relatively compact;
this follows from the corresponding result for $\mathbb{K}^d$ and the fact that $X$ is isomorphic to
$\mathbb{K}^d$ for some $d \ge 1$ by Corollary 1.35. Conversely, a normed space with the property
that every bounded subset is relatively compact is finite-dimensional:


Theorem 1.38 (Finite-dimensional Banach spaces). The unit ball of a normed space X 
is relatively compact if and only if X is finite-dimensional. 


The proof depends on the following lemma: 


Lemma 1.39 (Riesz). If $Y$ is a proper closed subspace of a normed space $X$, then for
every $\varepsilon > 0$ there exists a norm one vector $x \in X$ with $d(x,Y) \ge 1 - \varepsilon$.


Here, $d(x,Y) = \inf_{y\in Y} \|x - y\|$ is the distance from $x$ to $Y$.


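Proof Fix any $x_0 \in X \setminus Y$; such $x_0$ exists since $Y$ is a proper subspace of $X$. Fix $\varepsilon > 0$
and choose $y_0 \in Y$ such that $\|x_0 - y_0\| \le (1+\varepsilon) d(x_0, Y)$. The vector $(x_0 - y_0)/\|x_0 - y_0\|$
has norm one, and for all $y \in Y$ we have

\[ \Big\| \frac{x_0 - y_0}{\|x_0 - y_0\|} - y \Big\| = \frac{\big\| x_0 - y_0 - \|x_0 - y_0\| y \big\|}{\|x_0 - y_0\|} \ge \frac{d(x_0,Y)}{(1+\varepsilon) d(x_0,Y)} = \frac{1}{1+\varepsilon}. \]

It follows that

\[ d\Big( \frac{x_0 - y_0}{\|x_0 - y_0\|}, Y \Big) \ge \frac{1}{1+\varepsilon}. \]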

Since $(1+\varepsilon)^{-1} \to 1$ as $\varepsilon \downarrow 0$, this completes the proof.

Proof of Theorem 1.38 It remains to prove the 'only if' part. Suppose that $X$ is
infinite-dimensional and pick an arbitrary norm one vector $x_1 \in X$. Proceeding by induction,
suppose that norm one vectors $x_1,\dots,x_n \in X$ have been chosen such that $\|x_k - x_j\| \ge \frac12$
for all $1 \le j \ne k \le n$. Choose a norm one vector $x_{n+1} \in X$ by applying Riesz's lemma
to the proper closed subspace $Y_n = \operatorname{span}\{x_1,\dots,x_n\}$ and $\varepsilon = \frac12$ (that $Y_n$ is closed follows
from Corollary 1.36). Then $\|x_{n+1} - x_j\| \ge \frac12$ for all $1 \le j \le n$.

The resulting sequence $(x_n)_{n\ge 1}$ is contained in the closed unit ball of $X$ and satisfies
$\|x_j - x_k\| \ge \frac12$ for all $j \ne k \ge 1$, so $(x_n)_{n\ge 1}$ has no convergent subsequence. It follows
that the closed unit ball of $X$ is not compact.
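
In concrete sequence spaces a separated sequence of unit vectors can be written down directly, without appealing to Riesz's lemma. As a hedged illustration (working in $\mathbb{R}^{\mathrm{dim}}$ with the Euclidean norm as a finite stand-in for $\ell^2$, with hypothetical function names), the standard unit vectors are at mutual distance $\sqrt{2}$, no matter how large the dimension is:

```python
import math
from itertools import combinations


def unit_vector(j, dim):
    """The j-th standard unit vector of R^dim (0-based index)."""
    return [1.0 if i == j else 0.0 for i in range(dim)]


def distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def min_pairwise_distance(dim):
    """Smallest distance between two distinct standard unit vectors of R^dim."""
    vectors = [unit_vector(j, dim) for j in range(dim)]
    return min(distance(x, y) for x, y in combinations(vectors, 2))


for dim in (2, 10, 50):
    print(dim, min_pairwise_distance(dim))
```

Since the separation does not shrink as the dimension grows, the corresponding sequence of unit vectors in $\ell^2$ has no Cauchy subsequence, which is the mechanism used in the proof above.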


1.4 Compactness 


Let X be a normed space. By Theorem 1.38, the collections of bounded subsets of X 
and relatively compact subsets of X coincide if and only if X is finite-dimensional. Thus, 
in infinite-dimensional spaces, relative compactness is a stronger property than bound- 
edness. The purpose of the present section is to record some easy but useful general 
results on compactness that will be frequently used. Compactness in the spaces $C(K)$
and $L^p(\Omega)$ will be studied in the next chapter, and compact operators, that is, operators
which map bounded sets into relatively compact sets, are studied in Chapter 7.

By a general result in the theory of metric spaces (Theorem D.10), every relatively
compact set in a normed space is totally bounded, and the converse holds in Banach
spaces. This fact is used in the proof of the following necessary and sufficient condition
for compactness. For sets $A$ and $B$ in a vector space $V$ we write

\[ A + B := \{u + v : u \in A,\ v \in B\}. \]


Proposition 1.40. A subset $S$ of a Banach space $X$ is relatively compact if and only if
for all $\varepsilon > 0$ there exists a relatively compact set $K_\varepsilon \subseteq X$ such that $S \subseteq K_\varepsilon + B(0;\varepsilon)$.


Proof 'If': The existence of the sets $K_\varepsilon$ implies that $S$ is totally bounded and hence
relatively compact, for if the balls $B(x_{1,\varepsilon};\varepsilon),\dots,B(x_{N_\varepsilon,\varepsilon};\varepsilon)$ cover $K_\varepsilon$, then the balls
$B(x_{1,\varepsilon};2\varepsilon),\dots,B(x_{N_\varepsilon,\varepsilon};2\varepsilon)$ cover $S$.

'Only if': This is trivial (take $K_\varepsilon = S$ for all $\varepsilon > 0$).


The convex hull of a subset $F$ of a vector space $V$ is the smallest convex set in $V$
containing $F$. This set is denoted by $\mathrm{co}(F)$. When $F$ is a subset of a normed space, the
closure of $\mathrm{co}(F)$ is denoted by $\overline{\mathrm{co}}(F)$ and is referred to as the closed convex hull of $F$.

As a first application of Proposition 1.40 we have the following result. 


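Proposition 1.41. The closed convex hull of a compact set in a Banach space is compact.

Proof Let $K$ be a compact subset of the Banach space $X$. For every $N \ge 1$ the set

\[ \mathrm{co}_N(K) := \Big\{ \sum_{n=1}^N \lambda_n x_n : \ x_n \in K \text{ and } 0 \le \lambda_n \le 1 \text{ for all } n = 1,\dots,N, \ \sum_{n=1}^N \lambda_n = 1 \Big\} \]

is contained in the image of the compact set $[0,1]^N \times K^N$ under the continuous mapping
that sends $((\lambda_1,\dots,\lambda_N),(x_1,\dots,x_N))$ to $\sum_{n=1}^N \lambda_n x_n$.

Let $\varepsilon > 0$ be arbitrary, let the open balls $B(\xi_1;\varepsilon),\dots,B(\xi_M;\varepsilon)$, with centres $\xi_1,\dots,\xi_M \in K$, cover $K$, and consider
an element $x \in \mathrm{co}(K)$, say $x = \sum_{j=1}^k \lambda_j x_j$. For each $j = 1,\dots,k$ let $1 \le m_j \le M$ be an index
such that

\[ \|x_j - \xi_{m_j}\| = \min_{1\le m\le M} \|x_j - \xi_m\|. \]

Then

\[ \Big\| x - \sum_{j=1}^k \lambda_j \xi_{m_j} \Big\| \le \sum_{j=1}^k \lambda_j \|x_j - \xi_{m_j}\| < \sum_{j=1}^k \lambda_j \varepsilon = \varepsilon. \]

Since $\sum_{j=1}^k \lambda_j \xi_{m_j} = \sum_{m=1}^M \big( \sum_{j:\, m_j = m} \lambda_j \big) \xi_m \in \mathrm{co}_M(K)$, this implies that $x \in \mathrm{co}_M(K) +
B(0;\varepsilon)$. This shows that $\mathrm{co}(K) \subseteq \mathrm{co}_M(K) + B(0;\varepsilon)$. It now follows from Proposition
1.40 that $\mathrm{co}(K)$ is relatively compact.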


The second result asserts that strong convergence implies uniform convergence on 
relatively compact sets. 


Proposition 1.42. Let $X$ and $Y$ be normed spaces, let the operators $T_n \in \mathscr{L}(X,Y)$,
$n \ge 1$, be uniformly bounded, and let $T \in \mathscr{L}(X,Y)$. If $\lim_{n\to\infty} T_n = T$ strongly, then for
all relatively compact subsets $K$ of $X$ we have

\[ \lim_{n\to\infty} \sup_{x\in K} \|T_n x - Tx\| = 0. \]


It will be shown in Proposition 5.3 that if $X$ is a Banach space, strong convergence
$T_n \to T$ already implies uniform boundedness of the operators $T_n$.


Proof Let $K$ be a relatively compact subset of $X$, let $\varepsilon > 0$ be arbitrary, and select
finitely many open balls $B(x_1;\varepsilon),\dots,B(x_k;\varepsilon)$ covering $K$. Choose $N \ge 1$ so large that
$\|T_n x_j - T x_j\| < \varepsilon$ for all $n \ge N$ and $j = 1,\dots,k$. Let $M := \sup_{n\ge 1} \|T_n\|$; this number is
finite by assumption. Fixing an arbitrary $x \in K$, choose $1 \le j_0 \le k$ such that $\|x - x_{j_0}\| <
\varepsilon$. Then, for $n \ge N$,

\[
\begin{aligned}
\|T_n x - Tx\| &\le \|T_n x - T_n x_{j_0}\| + \|T_n x_{j_0} - T x_{j_0}\| + \|T x_{j_0} - Tx\| \\
&\le M\varepsilon + \varepsilon + M\varepsilon = (2M+1)\varepsilon.
\end{aligned}
\]


Taking the supremum over $x \in K$, it follows that if $n \ge N$, then

\[ \sup_{x\in K} \|T_n x - Tx\| \le (2M+1)\varepsilon. \]

Since $\varepsilon > 0$ was arbitrary, this proves the final assertion.


1.5 Integration in Banach Spaces 


In a variety of circumstances, some of which will be encountered in later chapters, one 
wishes to integrate X-valued functions, where X is a Banach space. In order to have 
the tools available when they are needed, we insert a brief discussion of the X-valued 
counterparts of the Riemann and Lebesgue integrals. 


1.5.a The Riemann Integral 


Let $K$ be a compact metric space and let $\mu$ be a finite Borel measure on $K$. We will set up the
Riemann integral with respect to $\mu$ for continuous functions $f : K \to X$. To this end we need the
following terminology. A partition of $K$ is a finite collection of pairwise disjoint Borel subsets
of $K$ whose union equals $K$. The mesh of a partition is the diameter of the largest subset in the
partition.


[Portrait: Bernhard Riemann, 1826–1866]


Proposition 1.43 (Riemann integral). Let $\mu$ be a finite Borel measure on a compact metric space
$K$, let $X$ be a Banach space, and let $f : K \to X$ be a continuous function. There exists a unique
element in $X$, denoted by $\int_K f \, d\mu$, with the following property: for every $\varepsilon > 0$ there exists
a $\delta > 0$ such that whenever $(K_n)_{n=1}^N$ is a partition of $K$ of mesh less than $\delta$ and $(t_n)_{n=1}^N$ is a
collection of points in $K$ with $t_n \in K_n$ for all $n = 1,\dots,N$, then

\[ \Big\| \int_K f \, d\mu - \sum_{n=1}^N \mu(K_n) f(t_n) \Big\| < \varepsilon. \]


The proof of this proposition follows the undergraduate construction of the Riemann
integral for continuous functions $f : [0,1] \to \mathbb{K}$ step-by-step and is therefore omitted.
The element $\int_K f \, d\mu$ is called the Riemann integral of $f$ with respect to $\mu$. Whenever
this is convenient we use the more elaborate notation $\int_K f(t) \, d\mu(t)$.


Proposition 1.44. Let $\mu$ be a finite Borel measure on a compact metric space $K$, let $X$
be a Banach space, and let $f : K \to X$ be a continuous function. Then

\[ \Big\| \int_K f \, d\mu \Big\| \le \int_K \|f\| \, d\mu. \]


Proof For any partition $(K_n)_{n=1}^N$ of $K$ and any collection of points $(t_n)_{n=1}^N$ in $K$ with
$t_n \in K_n$ for all $n = 1,\dots,N$ we have

\[ \Big\| \sum_{n=1}^N \mu(K_n) f(t_n) \Big\| \le \sum_{n=1}^N \mu(K_n) \|f(t_n)\| \]

by the triangle inequality. The result follows by taking the limit along any sequence of
partitions whose meshes tend to zero.
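
The inequality of Proposition 1.44 can be strict, as a small numerical experiment suggests. In the sketch below the setting is specialised, by assumption, to $K = [0,1]$ with Lebesgue measure and $X = \mathbb{R}^2$ with the Euclidean norm; the integral is approximated by Riemann sums over $n$ equal subintervals tagged at their left endpoints, and the function names are hypothetical. For $f(t) = (\cos 2\pi t, \sin 2\pi t)$ the integral of $f$ vanishes while the integral of $\|f\|$ equals one.

```python
import math


def riemann_sum(f, n):
    """Riemann sum of f : [0, 1] -> R^d over n equal subintervals, tagged at left endpoints."""
    total = None
    for k in range(n):
        value = f(k / n)
        if total is None:
            total = [0.0] * len(value)
        total = [s + v / n for s, v in zip(total, value)]
    return total


def euclid(x):
    return math.sqrt(sum(v * v for v in x))


def circle(t):
    """A continuous R^2-valued function tracing the unit circle once."""
    return [math.cos(2.0 * math.pi * t), math.sin(2.0 * math.pi * t)]


norm_of_integral = euclid(riemann_sum(circle, 1000))
integral_of_norm = riemann_sum(lambda t: [euclid(circle(t))], 1000)[0]
print(norm_of_integral, integral_of_norm)
```

The first number is zero up to rounding and the second is one up to rounding, so in this example the two sides of the inequality differ as much as they can for a function of norm one.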


In the special case where $K = [0,1]$ and $\mu$ is the Lebesgue measure, the usual calculus
rules apply (defining differentiability of an $X$-valued function in the obvious way):


Proposition 1.45. Let $X$ be a Banach space and let $f : [0,1] \to X$ be a function. Then:


(1) if $f$ is differentiable at the point $t_0 \in [0,1]$, then $f$ is continuous at $t_0$;
(2) if $f$ is differentiable on $(0,1)$ and $f' = 0$ on $(0,1)$, then $f$ is constant on $(0,1)$;
(3) if $f$ is continuously differentiable on $[0,1]$, then

\[ \int_0^1 f'(t) \, dt = f(1) - f(0). \]


Proof (1): Fix an arbitrary $\varepsilon > 0$. The assumption implies there exists $\delta > 0$ such that
if $t \in [0,1]$ with $0 < |t - t_0| < \delta$, then

\[ \Big\| \frac{f(t) - f(t_0)}{t - t_0} - f'(t_0) \Big\| < \varepsilon. \]

Then $\|f(t) - f(t_0)\| < (\varepsilon + \|f'(t_0)\|) |t - t_0|$ and continuity at $t_0$ follows.


(2): The usual calculus proof via Rolle's theorem does not extend to the present
setting, as it uses the order structure of the real numbers.

Fix an arbitrary $\varepsilon > 0$. For each $t \in (0,1)$, the assumption $f'(t) = 0$ implies that there
exists $h(t) > 0$ such that the interval $I_t := (t - h(t), t + h(t))$ is contained in $(0,1)$ and

\[ \|f(t) - f(s)\| \le \varepsilon |t - s|, \quad s \in I_t. \]

Fix a closed subinterval $[a,b] \subseteq (0,1)$. The intervals $I_t$, $t \in [a,b]$, cover the compact set
$[a,b]$ and therefore this set is contained in the union of finitely many intervals $I_{t_1},\dots,I_{t_N}$.
By adding the intervals $I_a$ and $I_b$ and relabelling (and perhaps discarding some of the
intervals), we may assume that $a = t_1$, $b = t_N$, and $I_{t_n} \cap I_{t_{n+1}} \ne \emptyset$ for $n = 1,\dots,N-1$.


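Choosing $s_n \in I_{t_n} \cap I_{t_{n+1}}$, we have

\[
\begin{aligned}
\|f(t_{n+1}) - f(t_n)\| &\le \|f(t_{n+1}) - f(s_n)\| + \|f(s_n) - f(t_n)\| \\
&\le \varepsilon (t_{n+1} - s_n) + \varepsilon (s_n - t_n) = \varepsilon (t_{n+1} - t_n).
\end{aligned}
\]

Now let $t \in [a,b]$, say $t \in I_{t_k}$. Then

\[
\begin{aligned}
\|f(t) - f(a)\| &\le \|f(t) - f(t_k)\| + \|f(t_k) - f(t_{k-1})\| + \cdots + \|f(t_2) - f(t_1)\| \\
&\le \varepsilon (t - t_k) + \varepsilon (t_k - t_{k-1}) + \cdots + \varepsilon (t_2 - t_1) = \varepsilon (t - a).
\end{aligned}
\]

This being true for all $\varepsilon > 0$ it follows that $f(t) = f(a)$ for all $t \in [a,b]$. This proves that
$f$ is constant on every subinterval $[a,b] \subseteq (0,1)$ and therefore on $(0,1)$.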


(3): For the function $g : [0,1] \to X$, $g(t) := f(t) - \int_0^t f'(s) \, ds$, we have

\[ \lim_{h\to 0} \frac{1}{h} \big( g(t+h) - g(t) \big) = f'(t) - \lim_{h\to 0} \frac{1}{h} \int_t^{t+h} f'(s) \, ds = 0 \]

by continuity, and therefore $g$ is continuously differentiable on $[0,1]$ with derivative
$g' = 0$. It follows from (2) that $g$ is constant on $(0,1)$, hence on $[0,1]$ by continuity, and
then $g(0) = f(0)$ implies

\[ f(t) - \int_0^t f'(s) \, ds = g(t) = g(0) = f(0), \quad t \in [0,1]. \]

Taking $t = 1$ gives the result.


In Chapter 4 we will sketch a different proof using duality. 


1.5.b The Bochner Integral 


We turn next to the more delicate problem of generalising the Lebesgue integral to 
functions taking values in a Banach space X. The results of this section will be needed 
only in Chapter 13. 

In what follows we fix a measure space $(\Omega, \mathscr{F}, \mu)$. It is a matter of experience that
if one attempts to define the measurability of a function $f : \Omega \to X$ by imposing that
$f^{-1}(B)$ be in $\mathscr{F}$ for all Borel (equivalently, for all open) subsets $B$ of $X$, one arrives at
a notion of measurability that is not very practical, the problem being that it does not
connect well with approximation theorems such as the dominated convergence theorem.
It turns out that it is better to start from the following necessary and sufficient condition
for measurability in the scalar-valued setting: A scalar-valued function is measurable if
and only if it is the pointwise limit of a sequence of simple functions.

For a function $f : \Omega \to \mathbb{K}$ and $x \in X$ we define $f \otimes x : \Omega \to X$ by

\[ (f \otimes x)(\omega) := f(\omega) x. \tag{1.6} \]


Definition 1.46 (Simple functions, strong measurability). A function $f : \Omega \to X$ is
called simple if it is a finite linear combination of functions of the form $\mathbf{1}_F \otimes x$ with
$F \in \mathscr{F}$ and $x \in X$, and strongly measurable if it is the pointwise limit of a sequence of
simple functions.

A scalar-valued function is strongly measurable if and only if it is measurable, and
for such functions we omit the adjective 'strongly'.


Theorem 1.47 (Pettis measurability theorem, first version). A function $f : \Omega \to X$ is
strongly measurable if and only if $f$ takes its values in a separable closed subspace $X_0$
of $X$ and the nonnegative functions $\|f(\cdot) - x_0\|$ are measurable for all $x_0 \in X_0$.


A second version of this theorem will be proved in Chapter 4 (see Theorem 4.19).


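Proof 'If': Let $(x_n)_{n\ge 1}$ be dense in $X_0$ and define the functions $\phi_n : X_0 \to \{x_1,\dots,x_n\}$
as follows. For each $y \in X_0$ let $k(n,y)$ be the least integer $1 \le k \le n$ such that

\[ \|y - x_k\| = \min_{1\le j\le n} \|y - x_j\| \]

and put $\phi_n(y) := x_{k(n,y)}$. Since $(x_n)_{n\ge 1}$ is dense in $X_0$ we have

\[ \lim_{n\to\infty} \|\phi_n(y) - y\| = 0, \quad y \in X_0. \]

Now define $\psi_n : \Omega \to X$ by

\[ \psi_n(\omega) := \phi_n(f(\omega)), \quad \omega \in \Omega. \]

We have

\[ \{\omega \in \Omega : \psi_n(\omega) = x_1\} = \Big\{ \omega \in \Omega : \|f(\omega) - x_1\| = \min_{1\le j\le n} \|f(\omega) - x_j\| \Big\} \]

and, for $2 \le k \le n$,

\[ \{\omega \in \Omega : \psi_n(\omega) = x_k\} = \Big\{ \omega \in \Omega : \|f(\omega) - x_k\| = \min_{1\le j\le n} \|f(\omega) - x_j\| < \min_{1\le j\le k-1} \|f(\omega) - x_j\| \Big\}. \]

In both identities, the set on the right-hand side is in $\mathscr{F}$. Hence each $\psi_n$ is simple, takes
values in $X_0$, and for all $\omega \in \Omega$ we have

\[ \lim_{n\to\infty} \|\psi_n(\omega) - f(\omega)\| = \lim_{n\to\infty} \|\phi_n(f(\omega)) - f(\omega)\| = 0. \]

'Only if': Let $f_n \to f$ pointwise with each $f_n$ simple. Let $X_0$ be the closed linear span
of the ranges of the functions $f_n$. Then $X_0$ is separable and $f$ takes its values in $X_0$.
Moreover, $\omega \mapsto \|f(\omega) - x_0\| = \lim_{n\to\infty} \|f_n(\omega) - x_0\|$ is measurable.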

Corollary 1.48. If $\lim_{n\to\infty} f_n = f$ pointwise, with each $f_n$ strongly measurable, then $f$
is strongly measurable.


Proof We check the conditions of the Pettis measurability theorem. Every function $f_n :
\Omega \to X$ is the pointwise limit of a sequence of simple functions $f_{n,m} : \Omega \to X$, and every
$f_{n,m}$ takes at most finitely many different values. It follows that $f$ takes its values in the
closed linear span of these countably many finite sets, which is a separable subspace of
$X$. The measurability of the functions $\|f_n - x_0\|$ implies that $\|f - x_0\|$ is measurable.


Definition 1.49 ($\mu$-Simple functions). A simple function $f = \sum_{n=1}^N \mathbf{1}_{F_n} \otimes x_n$ is called
$\mu$-simple if $\mu(F_n) < \infty$ for all $n = 1,\dots,N$. For such functions we define

\[ \int_\Omega f \, d\mu := \sum_{n=1}^N \mu(F_n) x_n. \]

We leave it as a simple exercise to verify that $\int_\Omega f \, d\mu$ is well defined in the sense
that it does not depend on the representation of $f$ as a linear combination of functions
$\mathbf{1}_{F_n} \otimes x_n$ with $\mu(F_n) < \infty$. If $f$ is $\mu$-simple, the triangle inequality implies

\[ \Big\| \int_\Omega f \, d\mu \Big\| \le \int_\Omega \|f\| \, d\mu. \tag{1.7} \]
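
For a $\mu$-simple function both sides of (1.7) are finite sums, so the inequality can be checked by direct computation. The sketch below is an illustration under explicit assumptions: $X = \mathbb{R}^2$ with the Euclidean norm, the sets $F_n$ pairwise disjoint, and the function represented by a list of pairs $(\mu(F_n), x_n)$. This representation and the function names are hypothetical conveniences, not notation from the text.

```python
import math


def euclid(x):
    return math.sqrt(sum(v * v for v in x))


def integrate_simple(terms):
    """Integral of sum_n 1_{F_n} (x) x_n, given as a list of (mu(F_n), x_n) pairs."""
    dim = len(terms[0][1])
    total = [0.0] * dim
    for measure, vector in terms:
        total = [s + measure * v for s, v in zip(total, vector)]
    return total


def integrate_norm(terms):
    """Integral of ||f||; valid as written only when the sets F_n are pairwise disjoint."""
    return sum(measure * euclid(vector) for measure, vector in terms)


# f takes the value (1, 0) on a set of measure 1/2, (-1, 0) on measure 1/4, (0, 2) on measure 1/4.
terms = [(0.5, [1.0, 0.0]), (0.25, [-1.0, 0.0]), (0.25, [0.0, 2.0])]
print(integrate_simple(terms), euclid(integrate_simple(terms)), integrate_norm(terms))
```

Cancellation between the values $(1,0)$ and $(-1,0)$ is what makes the left-hand side of (1.7) strictly smaller than the right-hand side in this example.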


Definition 1.50 (Bochner integral). A strongly measurable function $f : \Omega \to X$ is said
to be Bochner integrable with respect to $\mu$ if there is a sequence of $\mu$-simple functions
$f_n : \Omega \to X$ such that

\[ \lim_{n\to\infty} \int_\Omega \|f - f_n\| \, d\mu = 0. \tag{1.8} \]

In that case we define the Bochner integral of $f$ by

\[ \int_\Omega f \, d\mu := \lim_{n\to\infty} \int_\Omega f_n \, d\mu. \tag{1.9} \]


The nonnegative functions $\|f - f_n\|$ are measurable by the Pettis measurability theorem,
so the integral in (1.8) is well defined. The limit in (1.9) exists since the assumption
together with (1.7) (applied to $f_n - f_m$) implies that $(\int_\Omega f_n \, d\mu)_{n\ge 1}$ is a Cauchy sequence
in $X$. We leave it as another simple exercise to verify that $\int_\Omega f \, d\mu$ is well defined in
the sense that it does not depend on the sequence of approximating functions $f_n$. It is
equally elementary to verify that if $\Omega = K$ is a compact metric space and $\mathscr{F}$ is its Borel
$\sigma$-algebra, then every continuous function $f : K \to X$ is Bochner integrable with respect
to $\mu$ and the Bochner integral coincides with the Riemann integral.


Proposition 1.51. A strongly measurable function $f : \Omega \to X$ is Bochner integrable
with respect to $\mu$ if and only if

\[ \int_\Omega \|f\| \, d\mu < \infty. \]


In this situation we have

\[ \Big\| \int_\Omega f \, d\mu \Big\| \le \int_\Omega \|f\| \, d\mu. \]

Proof 'If': Let $f$ be a strongly measurable function satisfying $\int_\Omega \|f\| \, d\mu < \infty$. Let $g_n$
be simple functions such that $\lim_{n\to\infty} g_n = f$ pointwise and define

\[ f_n := \mathbf{1}_{\{\|g_n\| \le 2\|f\|\}} \, g_n. \]

Then each $f_n$ is simple and we have $\lim_{n\to\infty} f_n = f$ pointwise. Since $\|f_n\| \le 2\|f\|$ pointwise
and $\int_\Omega \|f\| \, d\mu < \infty$, each $f_n$ is $\mu$-simple and by dominated convergence we obtain

\[ \lim_{n\to\infty} \int_\Omega \|f - f_n\| \, d\mu = 0. \]

'Only if': If $f$ is Bochner integrable and the $\mu$-simple function $g : \Omega \to X$ is such that
$\int_\Omega \|f - g\| \, d\mu \le 1$, then

\[ \int_\Omega \|f\| \, d\mu \le 1 + \int_\Omega \|g\| \, d\mu < \infty. \]

The final assertion follows from (1.7) by approximation.


Problems 
1.1 Show that in any normed space $X$, for all $x_0 \in X$ and $r > 0$ the following assertions
hold:


(a) $B(x_0;r) = \{x \in X : \|x - x_0\| < r\}$ is an open set.
(b) $\overline{B}(x_0;r) = \{x \in X : \|x - x_0\| \le r\}$ is a closed set.
(c) $\overline{B(x_0;r)} = \overline{B}(x_0;r)$, that is, $\overline{B}(x_0;r)$ is the closure of $B(x_0;r)$.


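1.2 Let $X$ be a normed space.


(a) Show that if $x,y \in X$ satisfy $\|x - y\| < \varepsilon$ with $0 < \varepsilon < \|x\|$, then $y \ne 0$ and

\[ \Big\| \frac{x}{\|x\|} - \frac{y}{\|y\|} \Big\| < \frac{2\varepsilon}{\|x\|}. \]

(b) Show that the constant 2 in part (a) is the best possible.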


1.3 Show that a norm $\|\cdot\|$ on the product $X = X_1 \times \cdots \times X_N$ of normed spaces is a
product norm if and only if $\|x\|_\infty \le \|x\| \le \|x\|_1$ for all $x = (x_1,\dots,x_N) \in X$, where

\[ \|x\|_\infty := \max_{1\le n\le N} \|x_n\|, \qquad \|x\|_1 := \sum_{n=1}^N \|x_n\|. \]

1.4 Show that if $X = X_1 \oplus \cdots \oplus X_N$ is a direct sum of normed spaces, then each summand
$X_n$ is closed as a subspace of $X$.
1.5 Prove that if $T \in \mathscr{L}(X,Y)$ is bounded, then

\[ \|T\| = \sup_{\|x\| = 1} \|Tx\| = \sup_{\|x\| < 1} \|Tx\|. \]


1.6 Let $X$ and $Y$ be normed spaces and let $T \in \mathscr{L}(X,Y)$. Prove that for all $x \in X$ and
$r > 0$ we have

$$ \sup_{y \in B(x;r)} \|Ty\| \ge r\|T\|. $$

1.7 Let $X_0 := C_c^1(0,1)$ be the vector space of all $C^1$-functions $f : [0,1] \to \mathbb{K}$ with
compact support in $(0,1)$.

(a) Show that $X := \{f \in C[0,1] : f(0) = f(1) = 0\}$ is a Banach space and that
$X_0$ is dense in $X$.
(b) Show that for each $f \in X_0$ the limit $\lim_{n\to\infty} T_n f$ exists with respect to the
norm of $X$ and equals $f'$, where

$$ T_n f(t) = \frac{f(t + 1/n) - f(t)}{1/n}. $$

(c) Show that there are functions $f \in X$ for which the limit $\lim_{n\to\infty} T_n f$ does not
exist in $X$.

This example shows that the uniform boundedness assumption cannot be omitted
in Proposition 1.19.

1.8 Show that if two norms $\|\cdot\|$ and $\|\cdot\|'$ on a normed space $X$ are equivalent, then
the norms of the completions of $(X, \|\cdot\|)$ and $(X, \|\cdot\|')$ are equivalent.

1.9 Let $\|\cdot\|$ and $\|\cdot\|'$ be two norms on a vector space $X$. Show that the following
assertions are equivalent:

(1) there exists a constant $C > 0$ such that $\|x\| \le C\|x\|'$ for all $x \in X$;
(2) every open set in $(X, \|\cdot\|)$ is open in $(X, \|\cdot\|')$;
(3) every convergent sequence in $(X, \|\cdot\|')$ is convergent in $(X, \|\cdot\|)$;
(4) every Cauchy sequence in $(X, \|\cdot\|')$ is Cauchy in $(X, \|\cdot\|)$.

1.10 Let $X$ be a Banach space with respect to the norms $\|\cdot\|$ and $\|\cdot\|'$. Suppose that
$\|\cdot\|$ and $\|\cdot\|'$ agree on a subspace $Y$ that is dense in $X$ with respect to both norms.

(a) Show that the norms agree on all of $X$.
Hint: Apply Proposition 1.18 to the identity mapping on $Y$, viewed as a mapping
from the normed space $(Y, \|\cdot\|)$ to the normed space $(Y, \|\cdot\|')$ and as a mapping
in the opposite direction.
(b) Comment on the following "solution" to part (a): Let $x \in X$ be fixed and,
using density, choose a sequence $x_n \to x$ with $x_n \in Y$ for all $n \ge 1$. Then
$\|x\| = \lim_{n\to\infty} \|x_n\| = \lim_{n\to\infty} \|x_n\|' = \|x\|'$.

1.11 Provide the details to the 'if' part of the proof of Proposition 1.13.

1.12 Let $X$ be a normed space.

(a) Show that if $X$ is separable, then the completion of $X$ is separable.
(b) Show that if $X$ is a Banach space and $Y$ is a closed subspace of $X$, then $X$ is
separable if and only if both $Y$ and $X/Y$ are separable.

1.13 Determine whether the following sets are open and/or closed in $C[0,1]$:

(a) $\{f \in C[0,1] : f(t) > 0 \text{ for all } t \in [0,1]\}$;
(b) $\{f \in C[0,1] : f(t) \ge 0 \text{ for all } t \in [0,1]\}$.

1.14 This problem gives an example of a bounded operator that does not attain its norm.
Let $X$ be the space of continuous functions $f : [0,1] \to \mathbb{K}$ satisfying $f(0) = 0$.

(a) Show that $X$ is a closed subspace of $C[0,1]$.

Thus, with the norm inherited from $C[0,1]$, $X$ is a Banach space.

(b) Show that the operator $T : X \to \mathbb{K}$,

$$ Tf := \int_0^1 f(t)\,dt, $$

is bounded and has norm $\|T\| = 1$.
(c) Prove that $|Tf| < 1$ for all $f \in X$ with $\|f\|_\infty \le 1$.

1.15 This problem gives an example of a bounded operator whose range is not closed.
Consider the linear operator $T$ on $C[0,1]$ given by the indefinite integral

$$ Tf(t) := \int_0^t f(s)\,ds, \qquad t \in [0,1]. $$

(a) Show that $T$ is bounded and compute its norm.
(b) Show that $R(T) = \{f \in C^1[0,1] : f(0) = 0\}$ and conclude that $R(T)$ is a
proper dense subspace of $\{f \in C[0,1] : f(0) = 0\}$; in particular, $R(T)$ is not closed in $C[0,1]$.

1.16 Let $X$ be a Banach space and $Y$ be a normed space. Show that if $T : X \to Y$ is a
bounded operator satisfying $\|Tx\| \ge C\|x\|$ for some $C > 0$ and all $x \in X$, then its
range $R(T)$ is complete and $T$ is an isomorphism from $X$ to $R(T)$.

1.17 Let $X$ and $Y$ be finite-dimensional normed spaces. Prove that if $T_n, T \in \mathscr{L}(X,Y)$,
then the following assertions are equivalent:

(1) $\lim_{n\to\infty} T_n = T$ uniformly;
(2) $\lim_{n\to\infty} T_n = T$ strongly;
(3) $\lim_{n\to\infty} T_n = T$ weakly.

1.18 Let $1 \le p < \infty$.

(a) Show that $\ell^p$ is a dense subspace of $c_0$.
(b) Show that the inclusion mapping of $\ell^p$ into $c_0$ is bounded.

1.19 Show that a normed space $X$ and its completion $\overline{X}$ have the same dual, that is, the
restriction mapping $x^* \mapsto x^*|_X$ is an isometric isomorphism from $\overline{X}^*$ onto $X^*$.


1.20 Let $X$ be a real vector space. The product $X \times X$ can be given the structure of a
complex vector space by introducing a complex scalar multiplication as follows:

$$ (a + ib)(x,y) := (ax - by,\, bx + ay). $$

The idea is to think of the pair $(x,y) \in X \times X$ as "$x + iy$".

(a) Check that this formula for the scalar multiplication does indeed turn $X \times X$
into a complex vector space.

The resulting complex vector space is denoted by $X_{\mathbb{C}}$.
Suppose now that $X$ is a real normed space.

(b) Prove that the formula

$$ \|(x,y)\| := \sup_{\theta \in [0,2\pi]} \|(\cos\theta)x + (\sin\theta)y\| $$

defines a norm on $X_{\mathbb{C}}$ which turns it into a complex normed space. Show that
$X_{\mathbb{C}}$ is a Banach space if and only if $X$ is a Banach space.
(c) Show that this norm on $X_{\mathbb{C}}$ extends the norm of $X$ in the sense that
$\|(x,0)\| = \|(0,x)\| = \|x\|$ for all $x \in X$.
(d) Show that $\|(x,y)\| = \|(x,-y)\|$ for all $x,y \in X$.
(e) Show that any two norms on $X_{\mathbb{C}}$ which satisfy the identities in parts (c) and
(d) are equivalent.

1.21 Let $X$ be a real Banach space and let $X_{\mathbb{C}}$ be the complex Banach space constructed
in Problem 1.20.

(a) Show that if $T$ is a (real-)linear bounded operator on $X$, then $T$ extends to a
bounded (complex-)linear operator $T_{\mathbb{C}}$ on $X_{\mathbb{C}}$ by putting $T_{\mathbb{C}}(x,y) := (Tx, Ty)$.
(b) Show that $\|T_{\mathbb{C}}\| = \|T\|$.

1.22 As a variation on Proposition 1.40, show that a bounded subset $S$ of a Banach
space $X$ is relatively compact if and only if for every $\varepsilon > 0$ there exists a
finite-dimensional subspace $X_\varepsilon$ of $X$ such that $S \subseteq X_\varepsilon + B(0;\varepsilon)$.

1.23 Show that a subset $K$ of a Banach space $X$ is relatively compact if and only if
$K$ is contained in the closed convex hull of a sequence $(x_n)_{n\ge 1}$ in $X$ satisfying
$\lim_{n\to\infty} x_n = 0$.

Hint: For the 'only if' part, cover $K$ with finitely many balls of radius $3^{-n}$ and let
$C_n$ be the set of their centres; $n = 1,2,\dots$ Let $D_1 := C_1$ and, for $n \ge 2$,


Dele Sp oye CG in Se Ci, ep ea |S: 


Check that each x € K can be represented as an absolutely convergent sum x = 
Ln>1 dn With d, € D,. Consider the sequence (x, )n>1 given by x; := 2"dn. 


1.24 Let $(\Omega, \mathscr{F})$ be a measurable space. Adapting the proof of Theorem 1.47 show that
if $f : \Omega \to X$ is strongly measurable, there are simple functions $f_n : \Omega \to X$ such
that $f_n \to f$ and $\|f_n\| \le \|f\|$ pointwise.

1.25 Let $K$ be a compact metric space, let $\mu$ be a finite Borel measure on $K$, and let $X$ be
a Banach space. Prove that every continuous function $f : K \to X$ is Bochner integrable
with respect to $\mu$ and that its Bochner integral equals its Riemann integral.

1.26 Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and let $X_0$ be a closed subspace of the Banach
space $X$. Let $f : \Omega \to X$ satisfy $f(\omega) \in X_0$ for almost all $\omega \in \Omega$. Show:

(a) if $f$ is strongly measurable as an $X$-valued function, then $f$ is strongly
measurable as an $X_0$-valued function.
(b) if $f$ is Bochner integrable as an $X$-valued function, then $f$ is Bochner
integrable as an $X_0$-valued function.

1.27 Let $(\Omega, \mathscr{F}, \mu)$ be a measure space. Show that if $T : X \to Y$ is a bounded operator
and $f : \Omega \to X$ is Bochner integrable with respect to $\mu$, then $Tf : \Omega \to Y$ is
Bochner integrable with respect to $\mu$ and

$$ T \int_\Omega f\,d\mu = \int_\Omega Tf\,d\mu. $$

1.28 Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and let $(\Omega', \mathscr{F}')$ be a measurable space. Let
$\phi : \Omega \to \Omega'$ be measurable and let $f : \Omega' \to X$ be strongly measurable. Let
$\nu = \mu \circ \phi^{-1}$ be the image measure of $\mu$ under $\phi$.

(a) Show that $f \circ \phi$ is strongly measurable.
(b) Show that $f \circ \phi$ is Bochner integrable with respect to $\mu$ if and only if $f$ is
Bochner integrable with respect to $\nu$, and that in this situation we have

$$ \int_\Omega f \circ \phi\,d\mu = \int_{\Omega'} f\,d\nu. $$

1.29 Let $(\Omega, \mathscr{F}, \mu)$ be a probability space. Prove that if $f : \Omega \to X$ is Bochner
integrable, then

$$ \int_\Omega f\,d\mu \in \overline{\mathrm{co}}\,\{f(\omega) : \omega \in \Omega\}. $$


2 
The Classical Banach Spaces 


Before proceeding any further we pause to undertake a detailed study of the classical 
Banach spaces introduced in the previous chapter. 


2.1 Sequence Spaces 


Besides the finite-dimensional spaces $\mathbb{K}^d$, perhaps the simplest examples of Banach
spaces are provided by the class of sequence spaces. By definition, these are spaces
of sequences which, endowed with a suitable norm, turn into Banach spaces. Here we
introduce the most important sequence spaces, viz. $c_0$ and $\ell^p$, $1 \le p \le \infty$.


The Spaces $c_0$ and $\ell^\infty$ The space $c_0$ consisting of all scalar sequences $a = (a_k)_{k\ge 1}$
satisfying $\lim_{k\to\infty} a_k = 0$ is a Banach space with respect to the supremum norm

$$ \|a\|_\infty := \sup_{k \ge 1} |a_k|. $$

A justification of this notation is given in the next paragraph. That this is indeed a norm
is left as an exercise; the proof of completeness runs as follows. Suppose $(a^{(n)})_{n\ge 1}$ is
a Cauchy sequence in $c_0$. Then each coordinate sequence $(a_k^{(n)})_{n\ge 1}$ is Cauchy in $\mathbb{K}$
and therefore has a limit which we denote by $a_k$. We wish to prove that the sequence
$a := (a_k)_{k\ge 1}$ belongs to $c_0$ and that $\lim_{n\to\infty} \|a^{(n)} - a\|_\infty = 0$.

Fix $\varepsilon > 0$ and choose $N$ so large that $\|a^{(n)} - a^{(m)}\|_\infty < \varepsilon$ for all $m,n \ge N$. Choose $N'$
so large that $|a_k^{(N)}| < \varepsilon$ for all $k \ge N'$. Then, for $k \ge N'$,

$$ |a_k| \le |a_k - a_k^{(N)}| + |a_k^{(N)}| = \lim_{m\to\infty} |a_k^{(m)} - a_k^{(N)}| + |a_k^{(N)}| \le \varepsilon + \varepsilon = 2\varepsilon. $$

It follows that $\lim_{k\to\infty} |a_k| = 0$, so $a \in c_0$.

Finally, for all $k \ge 1$ and $m,n \ge N$ we have $|a_k^{(n)} - a_k^{(m)}| < \varepsilon$. Letting $m \to \infty$ while
keeping $n$ fixed, for all $k \ge 1$ we obtain

$$ |a_k^{(n)} - a_k| \le \varepsilon. $$

Taking the supremum over $k \ge 1$ we infer that $\|a^{(n)} - a\|_\infty \le \varepsilon$ for all $n \ge N$, and the
convergence $a^{(n)} \to a$ in $c_0$ follows.

In the same way one proves that the space $\ell^\infty$ consisting of all bounded scalar
sequences $a = (a_k)_{k\ge 1}$ is a Banach space with respect to the supremum norm. This space
contains $c_0$ isometrically as a closed subspace.


The Spaces $\ell^p$ For $1 \le p < \infty$, the space $\ell^p$ of scalar sequences $a = (a_k)_{k\ge 1}$ for which

$$ \|a\|_p := \Big( \sum_{k \ge 1} |a_k|^p \Big)^{1/p} $$

is finite is a Banach space with respect to the norm $\|\cdot\|_p$. That this is indeed a norm on
$\ell^p$ is nontrivial; the validity of the triangle inequality $\|a + b\|_p \le \|a\|_p + \|b\|_p$ can be
proved by following the line of proof of Proposition 2.19. Completeness of $\ell^p$ can be
proved as in Theorem 2.20. Alternatively, these facts can be deduced as special cases
of Proposition 2.19 and Theorem 2.20 by taking $\Omega = \{1,2,3,\dots\}$ with the counting
measure, that is, the measure which assigns mass $1$ to every element of $\Omega$.
It is easy to see (see Problem 2.1) that $1 \le p \le q \le \infty$ implies $\ell^p \subseteq \ell^q$ and

$$ \|a\|_q \le \|a\|_p $$

for all $a \in \ell^p$, and that if $a \in \ell^p$ for some $1 \le p < \infty$, then

$$ \lim_{q \to \infty} \|a\|_q = \|a\|_\infty. $$

This justifies the notation $\|\cdot\|_\infty$ for the supremum norm.
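
The two facts just stated can be checked numerically on a finitely supported sequence. The following is only a sketch: the helper `p_norm` and the sample sequence `a` are illustrative choices, not part of the text.

```python
import math

def p_norm(a, p):
    """Return the l^p norm of the finite sequence a; p = math.inf gives the supremum norm."""
    if p == math.inf:
        return max(abs(x) for x in a)
    return sum(abs(x) ** p for x in a) ** (1.0 / p)

# A finitely supported sequence (hypothetical sample data).
a = [3.0, -4.0, 1.0, 0.5]

exponents = [1, 2, 4, 8, 16, 64]
norms = [p_norm(a, p) for p in exponents]

# Monotonicity: p <= q implies ||a||_q <= ||a||_p.
for i in range(len(exponents) - 1):
    assert norms[i + 1] <= norms[i] + 1e-12

# The norms decrease towards the supremum norm as the exponent grows.
for p, value in zip(exponents, norms):
    print(f"p = {p:>2}: ||a||_p = {value:.6f}")
print(f"sup norm: {p_norm(a, math.inf):.6f}")
```

For this sample sequence the printed values decrease from $\|a\|_1 = 8.5$ towards the supremum norm $4$.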


Remark 2.1. In some applications it is useful to use countable index sets $I$ other than
the positive integers. We then define

$$ \|a\|_{\ell^p(I)} := \Big( \sum_{n \ge 1} |a_{i_n}|^p \Big)^{1/p}, $$

where $(i_n)_{n\ge 1}$ is an enumeration of $I$. This definition is independent of the choice of the
enumeration, and the space $\ell^p(I)$ of all mappings $a : I \to \mathbb{K}$ for which this expression is
finite is again a Banach space.


2.2 Spaces of Continuous Functions 


In this section we study some properties of the space C(K) of continuous functions
defined on a compact topological space K.
2.2.a Completeness


It is a standard result in any introductory course in Analysis that the uniform limit of 
a sequence of continuous functions is continuous. The following theorem recasts this 
result as a completeness result. 


Theorem 2.2 (Completeness). Let K be a compact topological space. The space C(K) 
is a Banach space with respect to the supremum norm 


$$ \|f\|_\infty := \sup_{x \in K} |f(x)|. $$


The elementary verification that this is indeed a norm is left to the reader. The above 
supremum is finite (and actually a maximum) since K is compact. 


Proof Suppose that $(f_n)_{n\ge 1}$ is a Cauchy sequence in $C(K)$. Then for each $x \in K$,
$(f_n(x))_{n\ge 1}$ is a Cauchy sequence in $\mathbb{K}$ and therefore convergent to some limit in $\mathbb{K}$
which we denote by $f(x)$. We will prove that the function $f$ thus defined is continuous
and that $\lim_{n\to\infty} \|f_n - f\|_\infty = 0$.

Fix $\varepsilon > 0$ and choose $N \ge 1$ so large that $\|f_n - f_m\|_\infty < \varepsilon$ for all $m,n \ge N$. Then in
particular for all $m,n \ge N$ and all $x \in K$ we have $|f_n(x) - f_m(x)| < \varepsilon$. Passing to the limit
$m \to \infty$ while keeping $n$ fixed we obtain

$$ |f_n(x) - f(x)| \le \varepsilon. \tag{2.1} $$

Now fix $x \in K$ arbitrary and let $U \subseteq K$ be an open set containing $x$ such that
$|f_N(x) - f_N(x')| < \varepsilon$ whenever $x' \in U$. Then, for $x' \in U$,

$$ |f(x) - f(x')| \le |f(x) - f_N(x)| + |f_N(x) - f_N(x')| + |f_N(x') - f(x')| \le \varepsilon + \varepsilon + \varepsilon = 3\varepsilon, $$

where we applied (2.1) to $n = N$ and the points $x$ and $x'$. An argument of this type is
called a $3\varepsilon$-argument. This proves the continuity of $f$ at the point $x$. Since $x \in K$ was
arbitrary, $f$ is continuous and therefore belongs to $C(K)$. Finally, since (2.1) holds for
all $x \in K$ it follows that

$$ \|f_n - f\|_\infty = \sup_{x \in K} |f_n(x) - f(x)| \le \varepsilon $$

for all $n \ge N$. This proves that $\lim_{n\to\infty} \|f_n - f\|_\infty = 0$.


We give three more examples of spaces of functions that are Banach spaces with
respect to the supremum norm. The proofs that these spaces are complete are similar to
the ones for $c_0$, $\ell^\infty$, and $C(K)$, and are left as an exercise.

• The space $B_b(X)$ of bounded Borel measurable functions on a topological space $X$.

• The space $C_b(X)$ of bounded continuous functions on a topological space $X$.

• The space $C_0(X)$ of continuous functions on a locally compact topological space $X$
which vanish at infinity (the precise definitions are given in Section 4.1.c).

2.2.b The Stone–Weierstrass Approximation Theorem


The Stone—Weierstrass theorem provides a useful density criterion for the spaces C(K). 
We begin with the more elementary Weierstrass approximation theorem for K = [a,b]. 


Theorem 2.3 (Weierstrass approximation theorem). The polynomials with coefficients
in $\mathbb{K}$ are dense in $C[a,b]$.


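Proof By translation and scaling it suffices to prove the theorem for the space $C[0,1]$.
Our proof is constructive in that it produces an actual sequence of polynomials
approximating a given function. Let $f \in C[0,1]$ be arbitrary and fixed and define the Bernstein
polynomials associated with $f$ by

$$ B_n^f(x) := \sum_{k=0}^n f\Big(\frac{k}{n}\Big) \binom{n}{k} x^k (1-x)^{n-k}, \qquad x \in [0,1],\ n \in \mathbb{N}. $$

We will show that $\lim_{n\to\infty} \|B_n^f - f\|_\infty = 0$. To begin with, the binomial identity

$$ \sum_{k=0}^n \binom{n}{k} x^k (1-x)^{n-k} = [x + (1-x)]^n = 1 \tag{2.2} $$

implies

$$ B_n^f(x) - f(x) = \sum_{k=0}^n \binom{n}{k} x^k (1-x)^{n-k} \Big( f\Big(\frac{k}{n}\Big) - f(x) \Big). $$

Fix an arbitrary $\varepsilon > 0$. Since $f$ is uniformly continuous there is a real number $0 < \delta < 1$
such that $|f(x) - f(x')| < \varepsilon$ whenever $x,x' \in [0,1]$ satisfy $|x - x'| < \delta$. Fix $x \in [0,1]$ and
set $I := \{0 \le k \le n : |k/n - x| < \delta\}$ and $I' := \{0 \le k \le n : k \notin I\}$. The sum over the indices
$k \in I$ can be estimated by

$$ \Big| \sum_{k \in I} \binom{n}{k} x^k (1-x)^{n-k} \Big( f\Big(\frac{k}{n}\Big) - f(x) \Big) \Big| \le \varepsilon \sum_{k=0}^n \binom{n}{k} x^k (1-x)^{n-k} = \varepsilon, \tag{2.3} $$

while for $k \in I'$ we have $1 \le \frac{1}{\delta^2} \big(\frac{k}{n} - x\big)^2$ and therefore

$$ \Big| \sum_{k \in I'} \binom{n}{k} x^k (1-x)^{n-k} \Big( f\Big(\frac{k}{n}\Big) - f(x) \Big) \Big| \le 2\|f\|_\infty \sum_{k \in I'} \binom{n}{k} x^k (1-x)^{n-k} $$
$$ \le \frac{2\|f\|_\infty}{\delta^2} \sum_{k=0}^n \Big(\frac{k}{n} - x\Big)^2 \binom{n}{k} x^k (1-x)^{n-k} \overset{(*)}{=} \frac{2\|f\|_\infty}{\delta^2} \cdot \frac{x(1-x)}{n} \le \frac{2}{\delta^2 n} \|f\|_\infty, $$

where in $(*)$ we used the binomial identity (2.2) in combination with the identities
(which are proved by induction on $n$)

$$ \sum_{k=0}^n \frac{k}{n} \binom{n}{k} x^k (1-x)^{n-k} = x, \qquad \sum_{k=0}^n \frac{k^2}{n^2} \binom{n}{k} x^k (1-x)^{n-k} = \frac{n-1}{n} x^2 + \frac{1}{n} x $$

to see that

$$ \sum_{k=0}^n \Big(\frac{k}{n} - x\Big)^2 \binom{n}{k} x^k (1-x)^{n-k} = \frac{n-1}{n} x^2 + \frac{1}{n} x - 2x \cdot x + x^2 = \frac{1}{n} x(1-x). $$

Combining things, we obtain

$$ |B_n^f(x) - f(x)| \le \varepsilon + \frac{2}{\delta^2 n} \|f\|_\infty. $$

Since this inequality holds for all $x \in [0,1]$, it follows that

$$ \|B_n^f - f\|_\infty \le \varepsilon + \frac{2}{\delta^2 n} \|f\|_\infty. $$

Letting $n \to \infty$ gives $\limsup_{n\to\infty} \|B_n^f - f\|_\infty \le \varepsilon$, and since $\varepsilon > 0$ was arbitrary
this shows that $f$ can be approximated arbitrarily well by polynomials.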
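
Since the proof is constructive, the approximation can be tried out directly. The sketch below evaluates Bernstein polynomials of one illustrative function and estimates the supremum norm of the error on a finite grid; the names `bernstein` and `sup_error`, the grid size, and the test function are assumptions made for this example only.

```python
from math import comb

def bernstein(f, n, x):
    """Evaluate the n-th Bernstein polynomial of f at a point x in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k) for k in range(n + 1))

def sup_error(f, n, grid_size=200):
    """Approximate the supremum norm of B_n f - f by sampling on a uniform grid."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)

def f(x):
    # Continuous but not differentiable at 1/2 (an illustrative choice).
    return abs(x - 0.5)

errors = {n: sup_error(f, n) for n in (4, 16, 64, 256)}
for n, err in errors.items():
    print(f"n = {n:>3}: sup |B_n f - f| ~ {err:.4f}")
```

The sampled errors decrease as $n$ grows, though slowly, which is consistent with the $1/n$ term in the estimate being multiplied by $\delta^{-2}$.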


Remark 2.4. The same argument shows that if $f : [0,1] \to \mathbb{K}$ is any function which is
continuous at a point $x_0 \in [0,1]$, then $\lim_{n\to\infty} B_n^f(x_0) = f(x_0)$.


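The proof of Theorem 2.3 has an interesting connection with the law of large numbers.
Suppose $\xi_1, \xi_2, \xi_3, \dots$ are independent identically distributed random variables
taking the values $0$ and $1$ with probability $1 - p$ and $p$, respectively. Let

$$ S_n := \frac{1}{n} \sum_{k=1}^n \xi_k. $$

Suppose that $f : [0,1] \to \mathbb{K}$ is continuous at the point $p \in [0,1]$. Denoting expectation and
probability by $\mathbb{E}$ and $\mathbb{P}$ respectively,

$$ \mathbb{E} f(S_n) = \sum_{k=0}^n f\Big(\frac{k}{n}\Big)\, \mathbb{P}\Big(S_n = \frac{k}{n}\Big) $$

and

$$ \mathbb{P}\Big(S_n = \frac{k}{n}\Big) = \binom{n}{k} p^k (1-p)^{n-k}. $$

From Remark 2.4 we therefore obtain

$$ \lim_{n\to\infty} \mathbb{E} f(S_n) = \lim_{n\to\infty} \sum_{k=0}^n f\Big(\frac{k}{n}\Big) \binom{n}{k} p^k (1-p)^{n-k} = \lim_{n\to\infty} B_n^f(p) = f(p). $$

In particular we recover the weak law of large numbers, which is the assertion that this
convergence holds for all $f \in C[0,1]$.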

The proof of the Weierstrass theorem using Bernstein polynomials offers little room
for generalisation, but the theorem itself does admit a far-reaching generalisation:

Theorem 2.5 (Stone–Weierstrass theorem, algebra version). Let $K$ be a compact
Hausdorff space and suppose that $Y$ is a subspace of $C(K)$ with the following properties:

(i) $\mathbf{1} \in Y$;
(ii) $g \in Y$ implies $\overline{g} \in Y$;
(iii) $g \in Y$ and $h \in Y$ implies $gh \in Y$;
(iv) $Y$ separates the points of $K$.


Then Y is dense in C(K). 


By definition, condition (iv) means that for any two distinct points $x,y \in K$ there exists a
function $g \in Y$ such that $g(x) \ne g(y)$.

[Figure: Karl Weierstrass, 1815-1897]

As a preliminary observation we note that it suffices to prove the theorem for real-valued
functions. Indeed, the complex version of the theorem follows from the real version as follows.
If $g \in Y$, then the real-valued functions $\operatorname{Re} g = \frac12 (g + \overline{g})$ and
$\operatorname{Im} g = \frac{1}{2i} (g - \overline{g})$ belong to $Y$. From this it is easy to see that the
real-linear space $Y_{\mathbb{R}}$ of all real-valued functions contained in $Y$ satisfies (i)–(iv) again.
Now if $f = u + iv \in C(K)$ we may use the real version of the theorem, with $Y$
replaced by $Y_{\mathbb{R}}$, to approximate $u$ and $v$ by functions $u_n, v_n \in Y_{\mathbb{R}}$. Then the functions
$u_n + i v_n$ approximate $f$.
The real version of the theorem will be deduced from its companion where condition
(iii) is replaced by closedness under taking pointwise absolute values:


Theorem 2.6 (Stone—Weierstrass theorem, lattice version). Let K be a compact Haus- 
dorff space and suppose that Y is a subspace of C(K) with the following properties: 


(i) $\mathbf{1} \in Y$;
(ii) $g \in Y$ implies $\overline{g} \in Y$;
(iii) $g \in Y$ implies $|g| \in Y$;
(iv) $Y$ separates the points of $K$.


Then Y is dense in C(K). 


Proof Reasoning as before, it suffices to prove the theorem over the real scalars.
For the minimum $a \wedge b := \min\{a,b\}$ and maximum $a \vee b := \max\{a,b\}$ we have the
formulas

$$ a \wedge b = \tfrac12 \big( (a+b) - |a-b| \big), \qquad a \vee b = \tfrac12 \big( (a+b) + |a-b| \big). $$

They imply that $Y$ is closed under taking pointwise maxima and minima.

Fix $f \in C(K)$ and $\varepsilon > 0$.

Step 1 – We prove that for each $x \in K$ there exists a function $g_x \in Y$ such that
$g_x(x) = f(x)$ and $g_x \le f + \varepsilon$ pointwise.

Since $Y$ is a subspace containing the constant functions and separating the points of
$K$, for all $y \in K$ there exists a function $g_{xy} \in Y$ such that $g_{xy}(x) = f(x)$ and $g_{xy}(y) = f(y)$.
The set $U_{xy} = \{z \in K : g_{xy}(z) < f(z) + \varepsilon\}$ is open and contains both $x$ and $y$. Since $K$ is
compact, the open cover $\{U_{xy} : y \in K\}$ has a finite subcover, say $\{U_{x y_n} : n = 1,\dots,N_x\}$.
The function $g_x := g_{x y_1} \wedge \cdots \wedge g_{x y_{N_x}}$ has the required properties.

Step 2 – We prove that there exists a function $g \in Y$ such that $f - \varepsilon \le g \le f + \varepsilon$; this
implies $\|f - g\|_\infty \le \varepsilon$ and concludes the proof.

For each $x \in K$ the set $U_x = \{z \in K : f(z) - \varepsilon < g_x(z)\}$ is open and contains $x$. Since $K$
is compact, the open cover $\{U_x : x \in K\}$ has a finite subcover, say $\{U_{x_n} : n = 1,\dots,N\}$.
The function $g := g_{x_1} \vee \cdots \vee g_{x_N}$ has the required properties.


Proof of Theorem 2.5 As has already been noted, it suffices to prove the theorem
over the real scalar field. Let $Y$ be a subspace of $C(K)$ with the properties (i)–(iv) stated
in Theorem 2.5. If we can approximate any $f \in C(K)$ with functions from the closure $\overline{Y}$,
we can also approximate with functions from $Y$. Since $\overline{Y}$ also satisfies the properties
(i)–(iv) of Theorem 2.5, we may assume that $Y$ is closed. The strategy of the proof is then
to show, under this additional closedness assumption, that $Y$ satisfies the assumptions
of Theorem 2.6. For this we need to show that if $f \in Y$, then also $|f| \in Y$.

Fix a function $g \in Y$ and let $\varepsilon > 0$. Since $K$ is compact, the range of $g$ is contained in
some compact interval $[a,b]$. By Theorem 2.3 there exists a polynomial $p : [a,b] \to \mathbb{R}$
such that $\|p - q\|_\infty < \varepsilon$, where $q(t) := |t|$ is the absolute value function. Since $Y$ is
an algebra containing the constant functions, $p \circ g$ belongs to $Y$ and satisfies
$\|p \circ g - |g|\|_{C(K)} < \varepsilon$. Since $\varepsilon > 0$ was arbitrary and $Y$ is closed, it follows that $|g| \in \overline{Y} = Y$.


Remark 2.7. Theorems 2.5 and 2.6 are stated for compact Hausdorff spaces, but the 
Hausdorff property was not used in the proofs. Note, however, that the separation-of- 
points assumptions in these theorems already imply the Hausdorff property. 


As a first application of Theorem 2.5 we have the following separability result.

Proposition 2.8. If $K$ is a compact metric space, then $C(K)$ is separable.

Proof We must find a countable set in $C(K)$ with dense span. Let $(x_n)_{n\ge 1}$ be a
countable dense set in $K$ (such a sequence can be realised by covering $K$, for each integer
$k \ge 1$, with finitely many open balls of radius $1/k$ using compactness and collecting
their centres). We may assume that all points in this sequence are distinct. For all pairs
$m \ne n$ the open balls $B_{mn} := B(x_m; \frac13 d(x_m,x_n))$ and $B_{nm} := B(x_n; \frac13 d(x_n,x_m))$ have
disjoint closures. The collection $\mathscr{B} = \{B_{mn} : m \ne n\}$ is countable and has the property that
whenever $x, y \in K$ are two distinct points, they can be separated by two balls contained
in $\mathscr{B}$ with disjoint closures. By Urysohn's lemma (Proposition C.10), for any two balls
$B_0 := B_{m_0,n_0}$ and $B_1 := B_{m_1,n_1}$ in $\mathscr{B}$ with disjoint closures there exists a function $f \in C(K)$
such that $f = 0$ on $B_0$ and $f = 1$ on $B_1$. The subspace $Y$ spanned by the countable set of
all finite products of functions of this form and the constant-one function $\mathbf{1}$ satisfies the
assumptions of Theorem 2.5 and is therefore dense in $C(K)$.


The next two examples give further illustrations of the Stone—Weierstrass theorem. 


Example 2.9. The trigonometric polynomials, that is, linear combinations of the functions

$$ e_n(\theta) := \exp(in\theta), \qquad n \in \mathbb{Z}, $$

are dense as functions in $C(\mathbb{T})$, where $\mathbb{T}$ denotes the unit circle, which we think of as
parametrised with $(-\pi,\pi]$. Indeed, they satisfy the requirements of Theorem 2.5. An
explicit procedure to approximate functions in $C(\mathbb{T})$ with trigonometric polynomials is
described in Section 3.5.a.


Example 2.10. Let $K_1,\dots,K_k$ be compact topological spaces. The linear combinations
of functions of the form

$$ f(x) = f_1(x_1) \cdots f_k(x_k), \qquad x = (x_1,\dots,x_k) \in K_1 \times \cdots \times K_k, $$

with $f_j \in C(K_j)$ for all $j = 1,\dots,k$, are dense in $C(K_1 \times \cdots \times K_k)$. Indeed, they satisfy
the requirements of Theorem 2.5.


2.2.c The Arzelà–Ascoli Compactness Theorem


The next theorem gives a necessary and sufficient condition for relative compactness in
$C(K)$. We need the following terminology. A subset $S \subseteq C(K)$ is said to be equicontinuous
at the point $x \in K$ if for all $\varepsilon > 0$ there exists an open set $U$ in $K$ containing $x$ such that
for all $x' \in U$ and $f \in S$ we have $|f(x) - f(x')| < \varepsilon$, and equicontinuous if it is equicontinuous
at every point of $K$. The set $S$ is said to be pointwise bounded if for all $x \in K$ we have

$$ \sup_{f \in S} |f(x)| < \infty. $$

Theorem 2.11 (Arzelà–Ascoli). Let $K$ be a compact topological space. A subset of
$C(K)$ is relatively compact if and only if it is pointwise bounded and equicontinuous.


An equivalent way of formulating the theorem is that a subset of C(K) is compact if 
and only if it is closed, pointwise bounded, and equicontinuous. 


Proof 'If': Let $S \subseteq C(K)$ be pointwise bounded and equicontinuous, and fix $\varepsilon > 0$. By
equicontinuity, for every $x \in K$ there is an open set $U_x$ in $K$ containing $x$ such that
$|f(x) - f(x')| < \varepsilon$ for all $x' \in U_x$ and $f \in S$. By compactness, finitely many of these open
sets cover $K$, say $U_{x_1},\dots,U_{x_k}$. By pointwise boundedness, for each $j = 1,\dots,k$ the set
$\{f(x_j) : f \in S\}$ is bounded. It follows that we can find $c_1,\dots,c_N \in \mathbb{K}$ such that for all
$f \in S$ and $j = 1,\dots,k$ we have $\min_{1\le n\le N} |f(x_j) - c_n| < \varepsilon$. Let
$\mathscr{N} := \{n = (n_1,\dots,n_k) : 1 \le n_j \le N \text{ for all } j = 1,\dots,k\}$. For $n \in \mathscr{N}$ let

$$ B_n := \{f \in S : |f(x_j) - c_{n_j}| < \varepsilon \text{ for all } j = 1,\dots,k\}. $$

By what we just observed,

$$ S = \bigcup_{n \in \mathscr{N}} B_n. $$

Suppose that $f,g \in B_n$ and let $x \in K$ be arbitrary. Then $x$ belongs to at least one of the
sets $U_{x_j}$. Then,

$$ |f(x) - g(x)| \le |f(x) - f(x_j)| + |f(x_j) - g(x_j)| + |g(x_j) - g(x)| $$
$$ \le \varepsilon + |f(x_j) - c_{n_j}| + |c_{n_j} - g(x_j)| + \varepsilon < 4\varepsilon, $$

where the last inequality holds uniformly with respect to $x \in K$. It follows that
$\|f - g\|_\infty < 4\varepsilon$. If, for each $n \in \mathscr{N}$ for which $B_n$ is nonempty, we pick a function $f_n \in B_n$
and consider the open balls $B(f_n; 4\varepsilon)$, we obtain a finite cover of $S$ with $4\varepsilon$-balls. Since
$\varepsilon > 0$ was arbitrary this means that $S$ is totally bounded and hence relatively compact,
by Theorem D.10.

'Only if': Suppose now that $S \subseteq C(K)$ is relatively compact. Then obviously $S$ is
uniformly bounded and hence pointwise bounded, so all we need to do is to prove that
$S$ is equicontinuous. To this end let $x_0 \in K$ and $\varepsilon > 0$ be arbitrary and fixed. We can
cover the compact set $\overline{S}$ with finitely many (say, $n$) open balls of radius $\varepsilon$. Let $f_1,\dots,f_n$
be their centres. Using the continuity of these (finitely many) functions we can find an
open set $U$ containing $x_0$ such that $|f_j(x) - f_j(x_0)| < \varepsilon$ for all $x \in U$ and $j = 1,\dots,n$.
Now consider an arbitrary $f \in S$. Choose $j_0 \in \{1,\dots,n\}$ such that $\|f - f_{j_0}\|_\infty < \varepsilon$; this
is possible by the choice of the functions $f_1,\dots,f_n$. Then for all $x \in U$ we have

$$ |f(x) - f(x_0)| \le |f(x) - f_{j_0}(x)| + |f_{j_0}(x) - f_{j_0}(x_0)| + |f_{j_0}(x_0) - f(x_0)| < 3\varepsilon. $$

This verifies the equicontinuity condition.
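
A standard illustration of how the criterion can fail, not taken from the text, is the family $f_n(x) = x^n$ on $[0,1]$: it is pointwise bounded but not equicontinuous at $x = 1$. Since $x^n - x^m \ge x^n - x^{2n}$ for $m \ge 2n$, any two members with indices $n$ and $m \ge 2n$ are at supremum distance at least $1/4$, so no subsequence is Cauchy. The sketch below checks both observations on a grid; the function name and grid size are illustrative assumptions.

```python
def sup_distance(n, m, grid_size=10_000):
    """Approximate the supremum over [0, 1] of |x**n - x**m| on a uniform grid."""
    return max(abs((i / grid_size) ** n - (i / grid_size) ** m) for i in range(grid_size + 1))

# The distance between f_n and f_{2n} stays near 1/4 (attained where x**n = 1/2).
distances = [sup_distance(n, 2 * n) for n in (1, 2, 4, 8, 16, 32)]

# Failure of equicontinuity at x = 1: |f_n(1) - f_n(1 - 1/n)| does not tend to 0.
gaps = [1.0 - (1.0 - 1.0 / n) ** n for n in (10, 100, 1000)]

print("sup |f_n - f_2n| :", [round(d, 4) for d in distances])
print("f_n(1) - f_n(1 - 1/n):", [round(g, 4) for g in gaps])
```

The second list approaches $1 - 1/e$, so no single neighbourhood of $1$ works for all $n$ simultaneously.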


2.2.d Applications to Differential Equations 


As an interlude to the main development of the theory, in this section we apply the 
completeness result of Theorem 2.2 and the compactness result of Theorem 2.11 to 
study the following initial value problem: 


$$ \begin{cases} u'(t) = f(t,u(t)), & t \in [0,T], \\ u(0) = u_0, \end{cases} \tag{IVP} $$

where $f : [0,T] \times \mathbb{K}^d \to \mathbb{K}^d$ is continuous and $u_0 \in \mathbb{K}^d$ is given.


Global Existence and Uniqueness A global solution is a continuously differentiable
function $u : [0,T] \to \mathbb{K}^d$ satisfying $u(0) = u_0$ and $u'(t) = f(t,u(t))$ for all $t \in [0,T]$.


Theorem 2.12 (Existence & uniqueness, Picard–Lindelöf). If $f : [0,T] \times \mathbb{K}^d \to \mathbb{K}^d$ is
continuous and there is a constant $L \ge 0$ such that for all $t \in [0,T]$ and $x,x' \in \mathbb{K}^d$ we
have

$$ |f(t,x) - f(t,x')| \le L|x - x'|, $$

then (IVP) admits a unique global solution.


The condition on f is often summarised by saying that f is Lipschitz continuous in 
its second variable, uniformly with respect to its first variable. 
The proof of Theorem 2.12 is based on the following abstract fixed point theorem. 


Theorem 2.13 (Banach fixed point theorem). Let $X$ be a complete metric space and let
$f : X \to X$ be uniformly contractive, that is, there exists a constant $0 \le c < 1$ such that

$$ d(f(x), f(x')) \le c\, d(x,x'), \qquad x,x' \in X. $$

Then $f$ has a unique fixed point, that is, there exists a unique element $x \in X$ with the
property that $f(x) = x$.


Proof If $x$ and $x'$ are both fixed points, then $d(x,x') = d(f(x), f(x')) \le c\, d(x,x')$, which
is only possible if $d(x,x') = 0$, that is, if $x = x'$. It follows that a fixed point, if it exists,
is unique.

To prove that a fixed point exists, choose an arbitrary $x_0 \in X$ and define the sequence
$(x_n)_{n\ge 0}$ by $x_{n+1} := f(x_n)$ for $n \ge 0$. We claim that this is a Cauchy sequence. Indeed,
for all $n \ge 1$ we have

$$ d(x_{n+1}, x_n) = d(f(x_n), f(x_{n-1})) \le c\, d(x_n, x_{n-1}), $$

and therefore by induction one sees that $d(x_{n+1}, x_n) \le c^{n-1} d(x_2,x_1)$ for all $n \ge 1$. For
all $m > n \ge N$ we have

$$ d(x_m, x_n) \le d(x_m, x_{m-1}) + \cdots + d(x_{n+1}, x_n) $$
$$ \le (c^{m-2} + \cdots + c^{n-1}) \cdot d(x_2,x_1) \le \Big( \sum_{k=N-1}^\infty c^k \Big) \cdot d(x_2,x_1) = \frac{c^{N-1}}{1-c} \cdot d(x_2,x_1), $$

and the right-hand side can be made small by taking $N$ large. This proves the claim.
Since $X$ was assumed to be complete, the sequence $(x_n)_{n\ge 1}$ converges in $X$. Let $x$
be its limit. Then the continuity of $f$ implies $f(x) = \lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} x_{n+1} = x$,
which shows that $x$ is a fixed point for $f$.
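
The iteration $x_{n+1} = f(x_n)$ used in the proof is easy to run. The following sketch applies it to an illustrative contraction, $\cos$ on $[0,1]$ (which maps $[0,1]$ into itself and has $|\cos'| \le \sin 1 < 1$ there), and checks the estimate $d(x_{n+1},x_n) \le c^{n-1} d(x_2,x_1)$ from the proof. The function name, tolerance, and choice of map are assumptions made for this example.

```python
import math

def fixed_point_iteration(f, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = f(x_n) until two successive iterates differ by at most tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Contraction constant for cos on [0, 1] (by the mean value theorem).
c = math.sin(1.0)
x_star, steps = fixed_point_iteration(math.cos, x0=1.0)
print(f"fixed point ~ {x_star:.10f} after {steps} iterations")

# Check d(x_{n+1}, x_n) <= c^(n-1) d(x_2, x_1); here xs[n] plays the role of x_n.
xs = [1.0]
for _ in range(20):
    xs.append(math.cos(xs[-1]))
d21 = abs(xs[2] - xs[1])
assert all(abs(xs[n + 1] - xs[n]) <= c ** (n - 1) * d21 + 1e-15 for n in range(1, 20))
```

The stopping rule based on successive differences is a practical choice for the sketch; the proof itself gives the a priori bound $\frac{c^{N-1}}{1-c}\, d(x_2,x_1)$ on the remaining distance.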
By $C([0,T];\mathbb{K}^d)$ we denote the space of all continuous functions $f : [0,T] \to \mathbb{K}^d$.
Endowed with the supremum norm, this space is a Banach space. Indeed, suppose that
$(f^{(n)})_{n\ge 1}$ is a Cauchy sequence in $C([0,T];\mathbb{K}^d)$. Then the $d$ sequences of coordinate
functions $(f_j^{(n)})_{n\ge 1}$ are Cauchy in $C[0,T]$ and therefore converge to limits $f_j$ in $C[0,T]$.
This easily implies that the sequence $(f^{(n)})_{n\ge 1}$ converges in $C([0,T];\mathbb{K}^d)$ to the function $f$
with coordinate functions $f_j$.

We will use the Banach fixed point theorem to prove that the (nonlinear) mapping
$I_T : C([0,T];\mathbb{K}^d) \to C([0,T];\mathbb{K}^d)$ defined by

$$ (I_T u)(t) := u_0 + \int_0^t f(s,u(s))\,ds, \qquad t \in [0,T], $$

where the integral is interpreted as a $\mathbb{K}^d$-valued Riemann integral, has a fixed point.
This will prove the theorem in view of the next lemma.


Lemma 2.14. A function $u \in C([0,T];\mathbb{K}^d)$ satisfies (IVP) for all $t \in [0,T]$ if and only if
$u$ is a fixed point of $I_T$.


Proof Indeed, $u$ is a fixed point of $I_T$ if and only if $u(t) = u_0 + \int_0^t f(s,u(s))\,ds$ for
all $t \in [0,T]$. By integration, this identity holds if $u$ is a solution, and conversely if the
identity holds, then $u$ is continuously differentiable (since the right-hand side is) and by
differentiation we obtain that $u$ is a solution.


Proof of Theorem 2.12 Let us start with a preliminary estimate that will be refined
shortly. For all $u,v \in C([0,T];\mathbb{K}^d)$ and all $t \in [0,T]$ we have

$$ |(I_T(u))(t) - (I_T(v))(t)| = \Big| \int_0^t f(s,u(s)) - f(s,v(s))\,ds \Big| $$
$$ \le \int_0^t L|u(s) - v(s)|\,ds \le \int_0^t L\|u - v\|\,ds \le LT\|u - v\|. $$

Taking the supremum over $t \in [0,T]$ we find that

$$ \|I_T(u) - I_T(v)\| \le LT\|u - v\|. $$

If $LT < 1$, then $I_T$ is uniformly contractive and the Banach fixed point theorem
guarantees the existence of a unique fixed point. This proves Theorem 2.12 in the special case
that the smallness condition $LT < 1$ is satisfied.

To get around this condition we modify the norm of $C([0,T];\mathbb{K}^d)$. Fix a real number
$\lambda > 0$ (in a moment we will see that we need $\lambda > L$), and define

$$ \|f\|_\lambda := \sup_{t \in [0,T]} e^{-\lambda t} |f(t)|. $$

It is clear that this defines a norm on $C([0,T];\mathbb{K}^d)$ and we have

$$ e^{-\lambda T} \|f\| \le \|f\|_\lambda \le \|f\|. $$

This implies that a sequence in $C([0,T];\mathbb{K}^d)$ is Cauchy with respect to the norm $\|\cdot\|_\lambda$ if
and only if it is Cauchy with respect to the norm $\|\cdot\|$, and since $C([0,T];\mathbb{K}^d)$ is complete
with respect to the latter, we conclude that $C([0,T];\mathbb{K}^d)$ is a Banach space with respect
to the norm $\|\cdot\|_\lambda$. Using this norm, we redo the above computations and find

$$ \|I_T(u) - I_T(v)\|_\lambda = \sup_{t \in [0,T]} e^{-\lambda t} |(I_T(u))(t) - (I_T(v))(t)| $$
$$ \le \sup_{t \in [0,T]} e^{-\lambda t} \int_0^t L e^{\lambda s} e^{-\lambda s} |u(s) - v(s)|\,ds $$
$$ \le \sup_{t \in [0,T]} e^{-\lambda t} \|u - v\|_\lambda \int_0^t L e^{\lambda s}\,ds $$
$$ = \sup_{t \in [0,T]} e^{-\lambda t} \|u - v\|_\lambda \cdot \frac{L}{\lambda} (e^{\lambda t} - 1) \le \frac{L}{\lambda} \|u - v\|_\lambda. $$

Hence if we choose $\lambda > L$, then $I_T$ is uniformly contractive on $C([0,T];\mathbb{K}^d)$ with respect
to the norm $\|\cdot\|_\lambda$. Now an application of the Banach fixed point theorem produces a
unique fixed point for $I_T$.
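
The effect of the weighted norm can be observed on a discretised version of the integral operator. The sketch below is an illustration under stated assumptions: the test problem $u' = -2u$, $u(0) = 1$ on $[0,2]$ (so $L = 2$ and $LT = 4 > 1$), the trapezoidal rule, the grid size, and all function names are choices made for this example and do not come from the text.

```python
import math

def picard_step(f, u0, ts, u):
    """One application of u -> u0 + int_0^t f(s, u(s)) ds, using the trapezoidal rule on the grid ts."""
    values = [f(t, x) for t, x in zip(ts, u)]
    result = [u0]
    for k in range(len(ts) - 1):
        h = ts[k + 1] - ts[k]
        result.append(result[-1] + 0.5 * h * (values[k] + values[k + 1]))
    return result

def sup_norm(u):
    return max(abs(x) for x in u)

def weighted_norm(u, ts, lam):
    """Discrete analogue of sup_t exp(-lam * t) |u(t)|."""
    return max(math.exp(-lam * t) * abs(x) for t, x in zip(ts, u))

def diff(u, v):
    return [a - b for a, b in zip(u, v)]

# Illustrative problem: u' = -2u, u(0) = 1 on [0, 2], so L = 2 and L*T = 4 > 1.
L, T, lam, steps = 2.0, 2.0, 4.0, 2000
ts = [T * k / steps for k in range(steps + 1)]

def rhs(t, x):
    return -L * x

iterates = [[1.0] * (steps + 1)]  # start from the constant function u_0
for _ in range(40):
    iterates.append(picard_step(rhs, 1.0, ts, iterates[-1]))

first, second, third = iterates[0], iterates[1], iterates[2]
ratio_sup = sup_norm(diff(third, second)) / sup_norm(diff(second, first))
ratio_weighted = (weighted_norm(diff(third, second), ts, lam)
                  / weighted_norm(diff(second, first), ts, lam))
error = sup_norm(diff(iterates[-1], [math.exp(-L * t) for t in ts]))

print(f"ratio in the supremum norm: {ratio_sup:.3f}")
print(f"ratio in the weighted norm: {ratio_weighted:.3f} (bound L/lambda = {L / lam:.3f})")
print(f"distance of the 40th iterate from exp(-2t): {error:.2e}")
```

In this run the first step is expanding in the supremum norm but contracting in the weighted norm, and the iterates still converge to the solution $e^{-2t}$ up to discretisation error.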


Remark 2.15. Theorem 2.12 remains true if we replace the interval $[0,T]$ by $[0,\infty)$ and
assume that $f : [0,\infty) \times \mathbb{K}^d \to \mathbb{K}^d$ satisfies

$$ |f(t,x) - f(t,x')| \le L|x - x'| $$

for all $t \in [0,\infty)$ and $x,x' \in \mathbb{K}^d$. Indeed, the preceding argument produces a solution $u_T$
on every interval $[0,T]$. We may now define $u : [0,\infty) \to \mathbb{K}^d$ by setting $u := u_T$ on the
interval $[0,T]$. Since by uniqueness we have $u_T = u_S$ on $[0, S \wedge T]$, this is well defined.
The resulting function is differentiable and satisfies (IVP) on every interval $[0,T]$, hence
on all of $[0,\infty)$, and is therefore a global solution on $[0,\infty)$.


Remark 2.16. All that has been said extends to the case where $\mathbb{K}^d$ is replaced by a
Banach space $X$. This equally pertains to the results of the next paragraph.


Local Existence As an application of Theorem 2.11 we present a local existence result 
for differential equations with continuous right-hand side. In contrast to the situation 
in Theorem 2.12, where a Lipschitz continuity assumption was made, we do not get 
uniqueness of the solution. 

We say that the problem admits a local solution if there exists 0< 6 < T anda 
differentiable function wu : [0,6] — K¢@ satisfying u(0) = uo and u'(t) = f(t,u(r)) for all 
t € [0,6]. 


Theorem 2.17 (Local existence, Peano). If $f : [0,T] \times \mathbb{K}^d \to \mathbb{K}^d$ is continuous, then the problem (IVP) admits a local solution.

[Portrait: Giuseppe Peano, 1858-1932]

A global solution need not always exist. Indeed, $u(t) = 1/(1-t)$, $t \in [0,\delta]$ with $0 < \delta < 1$, is a local solution of the problem

\[ u'(t) = (u(t))^2, \quad t \in [0,1], \qquad u(0) = 1, \]

but this problem does not have a global solution. This follows from the fact that a local solution on a subinterval $[0,\delta]$, if one exists, is unique. To see this, suppose that $u_1$ and $u_2$ are two solutions on $[0,\delta]$. Both $u_1$ and $u_2$ are continuous, hence bounded, let us say by the constant $M$. Consider now the function $\phi(y) := \min\{y^2, M^2\}$. This function is globally Lipschitz continuous on $[0,\delta] \times \mathbb{R}$, and therefore the problem

\[ u'(t) = \phi(u(t)), \quad t \in [0,\delta], \qquad u(0) = 1 \]

has a unique solution on $[0,\delta]$, say $u$. On the other hand $u_1$ and $u_2$ are solutions on $[0,\delta]$, because $\phi(u_1(t)) = (u_1(t))^2$ and $\phi(u_2(t)) = (u_2(t))^2$ for all $t \in [0,\delta]$, and therefore we must have $u_1 = u = u_2$ on $[0,\delta]$. The upshot of all this is that if a global solution exists, it must be equal to $1/(1-t)$ on every subinterval $[0,\delta]$, hence on the interval $[0,1)$; but the function $1/(1-t)$ cannot be extended to a continuous function on $[0,1]$.

The above uniqueness proof for local solutions made use of the fact that the right-hand side was locally Lipschitz continuous in the second variable in a neighbourhood of the initial value. In general, however, a local solution need not be unique. Indeed, both $u_1(t) = 0$ and $u_2(t) = t^{3/2}$ are solutions of the problem

\[ u'(t) = \tfrac{3}{2}\,(u(t))^{1/3}, \quad t \in [0,1], \qquad u(0) = 0. \]

The function $\phi(t,y) := \tfrac{3}{2}\,y^{1/3}$ fails to be locally Lipschitz continuous in the second variable in a neighbourhood of the initial value 0.
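
The two claims in this non-uniqueness example can be checked by a short computation. The following is a sketch, written for the right-hand side $\tfrac32 y^{1/3}$ and the candidate solution $t^{3/2}$; the difference quotient in the second line is an illustration added here and is not taken from the text.

```latex
% u_2(t) = t^{3/2} solves the problem:
\[
u_2'(t) = \tfrac{3}{2}\,t^{1/2} = \tfrac{3}{2}\,\bigl(t^{3/2}\bigr)^{1/3}
        = \tfrac{3}{2}\,(u_2(t))^{1/3}, \qquad u_2(0) = 0 .
\]
% The right-hand side is not Lipschitz near y = 0:
\[
\frac{\bigl|\tfrac{3}{2}\,y^{1/3} - \tfrac{3}{2}\,0^{1/3}\bigr|}{|y - 0|}
   = \tfrac{3}{2}\,y^{-2/3} \longrightarrow \infty
   \quad\text{as } y \downarrow 0 .
\]
```

Since the difference quotient is unbounded near the initial value, the truncation argument used for $u' = u^2$ has no analogue here.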


As a first step towards the proof of Theorem 2.17 we show that the problem (IVP) 
is equivalent to an integrated version of it. For the remainder of this section we assume 
that f is continuous. 


Lemma 2.18. A function $u \in C([0,\delta];\mathbb{K}^d)$ is a local solution of (IVP) if and only if for all $t \in [0,\delta]$ we have

\[ u(t) = u_0 + \int_0^t f(s,u(s))\,ds. \tag{2.4} \]

The function $s \mapsto f(s,u(s))$ is continuous on $[0,\delta]$, so the integral is well defined as a Riemann integral with values in $\mathbb{K}^d$.


Proof  If $u \in C([0,\delta];\mathbb{K}^d)$ is a local solution, then for all $t \in [0,\delta]$ we have

\[ u(t) - u_0 = u(t) - u(0) = \int_0^t u'(s)\,ds = \int_0^t f(s,u(s))\,ds \]

and (2.4) holds for all $t \in [0,\delta]$. Conversely, if $u \in C([0,\delta];\mathbb{K}^d)$ satisfies (2.4) for all $t \in [0,\delta]$, then $u$ is continuously differentiable on $[0,\delta]$. Using $u(0) = u_0$ and differentiating, we find

\[ u'(t) = \frac{d}{dt}\int_0^t f(s,u(s))\,ds = f(t,u(t)) \]

for all $t \in [0,\delta]$, that is, (IVP) holds.


Now we are ready for the proof of Theorem 2.17. It relies on a compactness argument. The idea is to construct, for small enough $\delta \in (0,T]$, a sequence of approximate solutions $(u_n)_{n\ge 1}$ in $C([0,\delta];\mathbb{K}^d)$ and show its equicontinuity (the definition of which extends to vector-valued functions in the obvious way). An appeal to the Arzelà-Ascoli theorem (which extends to the vector-valued case as well, without change in the proof) then produces a subsequence $(u_{n_k})_{k\ge 1}$ that converges in $C([0,\delta];\mathbb{K}^d)$. The limit $u \in C([0,\delta];\mathbb{K}^d)$ will be shown to solve (IVP) on the interval $[0,\delta]$.


Proof of Theorem 2.17  Let $M := \sup_{(t,x)\in[0,T]\times B(u_0;1)} |f(t,x)|$ and $\delta := \min\{T, 1/M\}$. For $n = 1,2,\dots$ we equipartition the interval $[0,\delta]$ using the partition points $t_{j,n} := jh_n$ for $j = 0,\dots,2^n$, where $h_n := 2^{-n}\delta$, and define inductively

\[ u_n(0) := u_0, \qquad u_n(t_{j+1,n}) := u_n(t_{j,n}) + h_n f(t_{j,n}, u_n(t_{j,n})), \quad j = 0,1,\dots,2^n - 1. \tag{2.5} \]

For the remaining values of $t \in [0,\delta]$ we define $u_n(t)$ by piecewise linear interpolation. Since each $u_n$ is piecewise continuously differentiable with derivatives bounded by $M$ (by an inductive argument based on (2.5), each $u_n(t_{j,n})$ belongs to $B(u_0;1)$), we have

\[ |u_n(t) - u_n(s)| \le M|t-s|, \qquad s,t \in [0,\delta]. \]

This implies that the functions $u_n$ are equicontinuous. The estimate

\[ |u_n(t)| \le |u_n(t) - u_n(0)| + |u_n(0)| \le M\delta + |u_0| \le 1 + |u_0|, \qquad t \in [0,\delta], \]

shows that they are also uniformly bounded. By the Arzelà-Ascoli theorem, some subsequence $(u_{n_k})_{k\ge 1}$ converges to a limiting function $u$ in $C([0,\delta];\mathbb{K}^d)$.
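Since $f$ is uniformly continuous on $[0,T] \times B(u_0;1)$ we have $\lim_{n\to\infty} C_n = 0$, where

\[ C_n := \sup_{|t-s|\le h_n}\ \sup_{|x-y|\le h_n M} |f(t,x) - f(s,y)|. \]

Writing

\[ u_n(t) = u_0 + \sum_{j=0}^{2^n-1} \int_{t_{j,n}}^{t_{j+1,n}} \mathbf{1}_{[0,t]}(s)\, f(t_{j,n}, u_n(t_{j,n}))\,ds, \qquad t \in [0,\delta], \]

we see that

\[ \Bigl| u_n(t) - \Bigl(u_0 + \int_0^t f(s,u_n(s))\,ds\Bigr) \Bigr| \le \sum_{j=0}^{2^n-1} \int_{t_{j,n}}^{t_{j+1,n}} \mathbf{1}_{[0,t]}(s)\, \bigl| f(t_{j,n}, u_n(t_{j,n})) - f(s,u_n(s)) \bigr|\,ds \le \delta\, C_n, \]

where the last step uses that $|s - t_{j,n}| \le h_n$ and $|u_n(s) - u_n(t_{j,n})| \le h_n M$ for $s \in [t_{j,n}, t_{j+1,n}]$. The right-hand side tends to 0 as $n \to \infty$. Taking limits, it follows that

\[ u(t) = \lim_{k\to\infty} u_{n_k}(t) = u_0 + \lim_{k\to\infty} \int_0^t f(s,u_{n_k}(s))\,ds = u_0 + \int_0^t f(s,u(s))\,ds, \]

and therefore $u$ solves the integrated version of (IVP) on $[0,\delta]$. By Lemma 2.18, $u$ then solves (IVP) on the interval $[0,\delta]$.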
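
The approximate solutions used in this proof are Euler polygons, and the recursion (2.5) is easy to run numerically. The following minimal sketch applies it to the scalar problem $u' = u^2$, $u(0) = 1$, discussed after Theorem 2.17. The function name `euler_polygon` and the choice $T = 1$ are assumptions made for illustration; with $B(u_0;1) = [0,2]$ one gets $M = 4$ and $\delta = 1/4$.

```python
def euler_polygon(f, u0, delta, n):
    """Nodes t_{j,n} and node values u_n(t_{j,n}) of the Euler polygon
    from (2.5), using 2**n steps of size h_n = delta / 2**n."""
    steps = 2 ** n
    h = delta / steps
    ts = [j * h for j in range(steps + 1)]
    us = [u0]
    for j in range(steps):
        # u_n(t_{j+1,n}) := u_n(t_{j,n}) + h_n * f(t_{j,n}, u_n(t_{j,n}))
        us.append(us[j] + h * f(ts[j], us[j]))
    return ts, us


def rhs(t, x):
    # Right-hand side of the illustrative problem u' = u^2.
    return x * x


delta = 0.25                    # min{T, 1/M} with T = 1 and M = 4
exact = 1.0 / (1.0 - delta)     # value of the solution 1/(1 - t) at t = delta

errors = []
for n in (4, 6, 8, 10):
    ts, us = euler_polygon(rhs, 1.0, delta, n)
    errors.append(abs(us[-1] - exact))

print(errors)   # errors at t = delta for increasing n
```

In this example the whole sequence converges, because the right-hand side is locally Lipschitz; in the setting of Theorem 2.17 only a subsequence is guaranteed to converge.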


2.3 Spaces of Integrable Functions 


Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and fix $1 \le p < \infty$. We define $\mathscr{L}^p(\Omega)$ as the set of all measurable functions $f : \Omega \to \mathbb{K}$ such that

\[ \int_\Omega |f|^p \,d\mu < \infty. \]

For such functions we set

\[ \|f\|_p := \Bigl( \int_\Omega |f|^p \,d\mu \Bigr)^{1/p}. \]

For $p = \infty$ we define $\mathscr{L}^\infty(\Omega)$ as the set of all measurable functions $f : \Omega \to \mathbb{K}$ that are $\mu$-essentially bounded, meaning that there is a set $N \in \mathscr{F}$ of $\mu$-measure 0 such that $f$ is bounded on $\complement N$. For such functions we define $\|f\|_\infty$ as the $\mu$-essential supremum of $f$,

\[ \|f\|_\infty := \mu\text{-}\operatorname*{ess\,sup}_{\omega\in\Omega} |f(\omega)| := \inf\{ r > 0 : |f| \le r \ \mu\text{-almost everywhere} \}. \]

When there is no risk of confusion, the measure $\mu$ is omitted from this notation.

The spaces $\mathscr{L}^p(\Omega)$ are vector spaces:
Proposition 2.19 (Minkowski inequality). Let $1 \le p \le \infty$. For all functions $f,g \in \mathscr{L}^p(\Omega)$ we have $f+g \in \mathscr{L}^p(\Omega)$ and

\[ \|f+g\|_p \le \|f\|_p + \|g\|_p. \]

[Portrait: Henri Lebesgue, 1875-1941]

Proof  The result is trivial for $p = \infty$, so we only consider the case $1 \le p < \infty$.

By elementary calculus it is checked that for all nonnegative real numbers $a$ and $b$ one has

\[ (a+b)^p = \inf_{t\in(0,1)} \bigl( t^{1-p}a^p + (1-t)^{1-p}b^p \bigr). \]

Applying this identity to $|f(\omega)|$ and $|g(\omega)|$ with $\omega \in \Omega$ and integrating with respect to $\mu$, for all fixed $t \in (0,1)$ we obtain

\[ \int_\Omega |f+g|^p \,d\mu \le \int_\Omega (|f|+|g|)^p \,d\mu \le t^{1-p} \int_\Omega |f|^p \,d\mu + (1-t)^{1-p} \int_\Omega |g|^p \,d\mu. \]

Stated differently, this says that

\[ \|f+g\|_p^p \le t^{1-p}\|f\|_p^p + (1-t)^{1-p}\|g\|_p^p. \]

Taking the infimum over all $t \in (0,1)$ gives the result.
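
The "elementary calculus" behind the infimum identity, and the way it finishes the proof, can be spelled out as follows. This is a sketch for $1 < p < \infty$ and $a, b > 0$; the auxiliary name $\psi$ is introduced here for illustration only.

```latex
% Minimising psi(t) = t^{1-p} a^p + (1-t)^{1-p} b^p over t in (0,1):
\[
\psi'(t) = (1-p)\bigl( t^{-p}a^p - (1-t)^{-p}b^p \bigr) = 0
\iff \frac{a}{t} = \frac{b}{1-t}
\iff t = \frac{a}{a+b}.
\]
% psi is convex on (0,1), so this critical point is the minimiser, and
\[
\psi\Bigl(\frac{a}{a+b}\Bigr)
  = (a+b)^{p-1}a + (a+b)^{p-1}b = (a+b)^p .
\]
% The same identity with a = \|f\|_p and b = \|g\|_p gives
\[
\inf_{t\in(0,1)} \bigl( t^{1-p}\|f\|_p^p + (1-t)^{1-p}\|g\|_p^p \bigr)
  = \bigl( \|f\|_p + \|g\|_p \bigr)^p .
\]
```

So the identity is used twice: pointwise before integrating, and once more for the two norms when the infimum over $t$ is taken.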


In general, $\|\cdot\|_p$ is not a norm on $\mathscr{L}^p(\Omega)$, because $\|f\|_p = 0$ only implies that $f = 0$ $\mu$-almost everywhere. In order to get around this imperfection, we define an equivalence relation $\sim$ on $\mathscr{L}^p(\Omega)$ by

\[ f \sim g \iff f = g \ \ \mu\text{-almost everywhere}. \]

The equivalence class of a function $f$ modulo $\sim$ is denoted by $[f]$. On the quotient space

\[ L^p(\Omega) := \mathscr{L}^p(\Omega)/\!\sim, \]

whose elements are the equivalence classes $[f]$ of functions $f \in \mathscr{L}^p(\Omega)$, we define a scalar multiplication and addition in the natural way:

\[ c[f] := [cf], \qquad [f] + [g] := [f+g]. \]

We leave it as an exercise to check that both operations are well defined. With these operations, $L^p(\Omega)$ is a normed vector space with respect to the norm

\[ \|[f]\|_p := \|f\|_p. \]

When we explicitly wish to express the dependence on $\mathscr{F}$ or $\mu$ we write $L^p(\Omega,\mathscr{F})$ or $L^p(\Omega,\mu)$. Following common practice we make no distinction between functions in $\mathscr{L}^p(\Omega)$ and their equivalence classes in $L^p(\Omega)$, and call the latter "functions" as well. In the same vein, we will not hesitate to talk about the "sets"

\[ \{\omega \in \Omega : f(\omega) \in B\} \]

when $B \subseteq \mathbb{K}$ is a Borel set and $f$ is an element of $L^p(\Omega)$. The rigorous interpretation is that it defines an equivalence class of sets in $\mathscr{F}$, the representatives of which are obtained by selecting pointwise defined measurable representatives for $f$.


2.3.a Completeness 


For the remainder of this section we fix a measure space $(\Omega, \mathscr{F}, \mu)$. The main result of this section is the following completeness result for the spaces $L^p(\Omega)$.

Theorem 2.20 (Completeness). For all $1 \le p \le \infty$ the normed space $L^p(\Omega)$ is complete.

Proof  First let $1 \le p < \infty$, and let $(f_n)_{n\ge 1}$ be a Cauchy sequence with respect to the norm $\|\cdot\|_p$ of $L^p(\Omega)$. By passing to a subsequence we may assume that

\[ \|f_{n+1} - f_n\|_p \le \frac{1}{2^n}, \qquad n = 1,2,\dots \]

Define the nonnegative measurable functions

\[ g_N := \sum_{n=0}^{N-1} |f_{n+1} - f_n|, \qquad g := \sum_{n=0}^{\infty} |f_{n+1} - f_n|, \]

with the convention that $f_0 = 0$. By the monotone convergence theorem,

\[ \int_\Omega g^p \,d\mu = \lim_{N\to\infty} \int_\Omega g_N^p \,d\mu. \]

Taking $p$th roots and using Minkowski's inequality we obtain

\[ \|g\|_p = \lim_{N\to\infty} \|g_N\|_p \le \lim_{N\to\infty} \sum_{n=0}^{N-1} \|f_{n+1} - f_n\|_p = \sum_{n=0}^{\infty} \|f_{n+1} - f_n\|_p \le 1 + \|f_1\|_p. \]

It follows that $g$ is finitely valued $\mu$-almost everywhere, which means that the sum defining $g$ converges absolutely $\mu$-almost everywhere. As a result, the sum

\[ f := \sum_{n=0}^{\infty} (f_{n+1} - f_n) \]

converges on the set $\{g < \infty\}$. On this set we have

\[ f = \lim_{N\to\infty} \sum_{n=0}^{N-1} (f_{n+1} - f_n) = \lim_{N\to\infty} f_N. \]



Defining $f$ to be identically zero on the null set $\{g = \infty\}$, the resulting function $f$ is measurable. From

\[ |f_N|^p = \Bigl| \sum_{n=0}^{N-1} (f_{n+1} - f_n) \Bigr|^p \le \Bigl( \sum_{n=0}^{N-1} |f_{n+1} - f_n| \Bigr)^p \le |g|^p \]

it follows that $|f|^p \le |g|^p$ and hence

\[ |f - f_N|^p \le 2^p \bigl( \tfrac{1}{2}(|f| + |f_N|) \bigr)^p \le 2^{p-1}\bigl( |f|^p + |f_N|^p \bigr) \le 2^p |g|^p, \]

using the convexity of $t \mapsto t^p$ (recall that a function $f : I \to \mathbb{R}$, where $I$ is an interval, is called convex if for all $x_0, x_1 \in I$ and $0 \le \lambda \le 1$ we have $f((1-\lambda)x_0 + \lambda x_1) \le (1-\lambda)f(x_0) + \lambda f(x_1)$). From the dominated convergence theorem we conclude that

\[ \lim_{N\to\infty} \|f - f_N\|_p = 0. \]

We have proved that a subsequence of the original Cauchy sequence converges to $f$ in $L^p(\Omega)$. As is easily verified, this implies that the original Cauchy sequence converges to $f$ as well. This completes the proof for exponents $1 \le p < \infty$.

It remains to establish the result for $p = \infty$. Let $(f_n)_{n\ge 1}$ be a Cauchy sequence with respect to the norm $\|\cdot\|_\infty$ of $L^\infty(\Omega)$. By passing to a subsequence we may assume that

\[ \|f_{n+1} - f_n\|_\infty \le \frac{1}{2^n}, \qquad n = 1,2,\dots \]

Choose $\mu$-null sets $F_n$ such that $|f_{n+1}(\omega) - f_n(\omega)| \le \frac{1}{2^n}$ for all $\omega \in \complement F_n$. Defining the functions $g_N$ and $g$ as before, we note that outside the $\mu$-null set $F := \bigcup_{n\ge 1} F_n$ we have uniform convergence $g_N \to g$. Defining the function $f$ as before, this implies that $f_N \to f$ uniformly outside $F$. This, in turn, means that $f_N \to f$ in $L^\infty(\Omega)$.


In the course of the proof we obtained the following result: 


Corollary 2.21. Every convergent sequence $(f_n)_{n\ge 1}$ in $L^p(\Omega)$, with $1 \le p \le \infty$, has a $\mu$-almost everywhere convergent subsequence $(f_{n_k})_{k\ge 1}$, and this subsequence may be chosen to satisfy $|f_{n_k}| \le g$ almost everywhere for some fixed $0 \le g \in L^p(\Omega)$.


In the majority of applications the first part of this corollary suffices, but the sec- 
ond part is sometimes helpful in setting the stage for an application of the dominated 
convergence theorem. 


Remark 2.22. Except when $p = \infty$, in the setting of Corollary 2.21 it need not be the case that the sequence $(f_n)_{n\ge 1}$ itself is $\mu$-almost everywhere convergent to its $L^p(\Omega)$-limit $f$ (see Problems 2.13 and 2.14).
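
A standard illustration of this phenomenon on $[0,1)$ with Lebesgue measure is the so-called typewriter sequence of indicator functions of dyadic intervals. The sketch below is an assumption-laden illustration and is not taken from the problems cited above: the names `typewriter_interval`, `f` and `lp_norm`, and the test point $x = 1/3$, are chosen here. It shows that the $L^p$-norms tend to 0 while the values at a fixed point keep returning to 1, and that the subsequence $n_k = 2^k$ does converge at that point.

```python
from fractions import Fraction


def typewriter_interval(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return the endpoints of
    [j / 2**k, (j + 1) / 2**k); f_n is the indicator of this interval."""
    k = n.bit_length() - 1
    j = n - 2 ** k
    return Fraction(j, 2 ** k), Fraction(j + 1, 2 ** k)


def f(n, x):
    a, b = typewriter_interval(n)
    return 1 if a <= x < b else 0


def lp_norm(n, p):
    # ||f_n||_p = (length of the interval)^(1/p)
    a, b = typewriter_interval(n)
    return float(b - a) ** (1.0 / p)


x = Fraction(1, 3)

# The norms tend to 0, so f_n -> 0 in L^p for every 1 <= p < infinity.
norms = [lp_norm(2 ** k, 2) for k in range(0, 11)]

# In every dyadic block 2**k <= n < 2**(k+1) exactly one f_n equals 1 at x,
# so the sequence (f_n(x)) does not converge.
hits = sum(f(n, x) for n in range(1, 2 ** 10))

# Along the subsequence n_k = 2**k the values at x are eventually 0.
subsequence_values = [f(2 ** k, x) for k in range(2, 12)]
```

Exact rational arithmetic via `Fraction` is used so that membership in the half-open dyadic intervals is decided without rounding.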


The inequality in the next result is known as Hölder's inequality. For $p = q = 2$ and $r = 1$ it reduces to a special case of the Cauchy-Schwarz inequality (see Proposition 3.3).

Proposition 2.23 (Hölder's inequality). Let $1 \le p,q,r \le \infty$ satisfy $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$. If $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$, then $fg \in L^r(\Omega)$ and

\[ \|fg\|_r \le \|f\|_p \|g\|_q. \]

For $r = 1$ the condition on $p$ and $q$ reads $\frac{1}{p} + \frac{1}{q} = 1$; we call such $p$ and $q$ conjugate exponents.

Proof  It suffices to prove the inequality for $r = 1$; the general case follows by applying this special case to the functions $|f|^r$ and $|g|^r$.

For $p = 1$, $q = \infty$ and for $p = \infty$, $q = 1$, the inequality follows by a direct estimate. Thus we may assume from now on that $1 < p,q < \infty$. The inequality is then proved in the same way as Minkowski's inequality, this time using the identity

\[ ab = \inf_{t>0} \Bigl( \frac{t^p a^p}{p} + \frac{b^q}{q\,t^q} \Bigr). \]
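
For completeness, here is a sketch of how the infimum identity gives Hölder's inequality for conjugate exponents $1 < p, q < \infty$. It uses Young's inequality $xy \le x^p/p + y^q/q$ for $x, y \ge 0$ as a known fact; the substitution $x = ta$, $y = b/t$ is the only ingredient added here.

```latex
% Young's inequality with x = ta and y = b/t, for every t > 0:
\[
ab = (ta)\,\frac{b}{t} \le \frac{t^p a^p}{p} + \frac{b^q}{q\,t^q},
\]
% with equality if and only if (ta)^p = (b/t)^q, i.e. t^{p+q} = b^q / a^p
% when a, b > 0. Hence the infimum over t > 0 equals ab.
% Applying the inequality pointwise to |f| and |g| and integrating:
\[
\|fg\|_1 \le \frac{t^p}{p}\,\|f\|_p^p + \frac{1}{q\,t^q}\,\|g\|_q^q ,
\qquad t > 0,
\]
% and taking the infimum over t > 0, now with a = \|f\|_p, b = \|g\|_q:
\[
\|fg\|_1 \le \|f\|_p\,\|g\|_q .
\]
```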


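Remark 2.24. Let $1 \le p_1,\dots,p_N, r \le \infty$ satisfy $\frac{1}{p_1} + \dots + \frac{1}{p_N} = \frac{1}{r}$. If $f_n \in L^{p_n}(\Omega)$ for $n = 1,\dots,N$, then $\prod_{n=1}^N f_n \in L^r(\Omega)$ and

\[ \Bigl\| \prod_{n=1}^N f_n \Bigr\|_r \le \prod_{n=1}^N \|f_n\|_{p_n}. \]

This more general version of Hölder's inequality follows from Proposition 2.23 by an easy induction argument.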


As an immediate corollary of Hölder's inequality we have the following result.

Corollary 2.25. Let $1 \le p,q,r \le \infty$ satisfy $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$. Then the mapping

\[ (f,g) \mapsto fg \]

is jointly continuous from $L^p(\Omega) \times L^q(\Omega)$ into $L^r(\Omega)$. In particular, if $1 \le p,q \le \infty$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$ and $f_n \to f$ in $L^p(\Omega)$, then for all $g \in L^q(\Omega)$ we have

\[ \lim_{n\to\infty} \int_\Omega f_n g \,d\mu = \int_\Omega f g \,d\mu. \]

Proof  If $f_n \to f$ in $L^p(\Omega)$ and $g_n \to g$ in $L^q(\Omega)$, then Hölder's inequality implies that $f_n g_n$, $f_n g$, $f g_n$, $fg$ belong to $L^r(\Omega)$ and

\[ \|f_n g_n - fg\|_r \le \|(f_n - f)g\|_r + \|f_n(g_n - g)\|_r \le \|f_n - f\|_p \|g\|_q + M \|g_n - g\|_q \to 0 \quad \text{as } n \to \infty, \]

where $M := \sup_n \|f_n\|_p$. By Proposition D.8, this proves the asserted continuity.

A useful special case of Hölder's inequality concerns the case of a finite measure. If $\mu(\Omega) < \infty$ and $1 \le r \le p \le \infty$, then Hölder's inequality implies that if $f \in L^p(\Omega)$, then $f \in L^r(\Omega)$ and

\[ \|f\|_r \le \mu(\Omega)^{\frac{1}{r} - \frac{1}{p}} \|f\|_p. \]

In the case of a probability measure $\mu$ this takes the simpler form $\|f\|_r \le \|f\|_p$.

The following result provides a converse to Hölder's inequality. We formulate it for exponents $\frac{1}{p} + \frac{1}{q} = 1$; as in the proof of Hölder's inequality, this implies a more general version for exponents $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$. A further variation will be given in Proposition 5.14.

Proposition 2.26. Let $1 \le p,q \le \infty$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. Let $(\Omega, \mathscr{F}, \mu)$ be a measure space, which is assumed to be $\sigma$-finite if $p = \infty$. A measurable function $f$ belongs to $L^p(\Omega)$ if and only if

\[ fg \in L^1(\Omega) \quad \text{and} \quad \|fg\|_1 \le M \|g\|_q \]

for some constant $M \ge 0$ and all $g \in Y$, where $Y$ is a dense subspace of $L^q(\Omega)$. In that case we have $\|f\|_p \le M$.


Proof  The 'only if' part is immediate from Hölder's inequality. To prove the 'if' part we may assume that $f$ is not identically 0.

Step 1 – By assumption, the mapping $g \mapsto fg$ is bounded, of norm at most $M$, as a mapping from the dense subspace $Y$ of $L^q(\Omega)$ to $L^1(\Omega)$. Hence by Proposition 1.18 it admits a unique extension to a bounded operator, of norm at most $M$, from $L^q(\Omega)$ to $L^1(\Omega)$. Denote this operator by $T$. If $g_n \to g$ in $L^q(\Omega)$ with each $g_n$ in $Y$, then $Tg = \lim_{n\to\infty} Tg_n = \lim_{n\to\infty} fg_n$ with convergence in $L^1(\Omega)$. Using Corollary 2.21 we may pass to a subsequence such that $g_{n_k} \to g$ and $fg_{n_k} \to Tg$ $\mu$-almost everywhere, and therefore

\[ Tg = \lim_{k\to\infty} fg_{n_k} = fg \quad \mu\text{-almost everywhere}. \]

This also implies that $fg \in L^1(\Omega)$ for all $g \in L^q(\Omega)$ and

\[ \|fg\|_1 \le M \|g\|_q, \qquad g \in L^q(\Omega). \tag{2.6} \]

Step 2 – In this step we prove the proposition for $1 \le p < \infty$ by showing that

\[ \int_\Omega |f|^p \,d\mu \le M^p. \tag{2.7} \]

To this end let $\phi$ be a $\mu$-simple function satisfying $0 \le \phi \le |f|$ $\mu$-almost everywhere, say $\phi = \sum_{j=1}^k c_j \mathbf{1}_{F_j}$ with coefficients $c_j \in \mathbb{K}$ and the sets $F_j$ disjoint and of finite measure. We first prove that

\[ \int_\Omega |\phi|^p \,d\mu \le M^p. \tag{2.8} \]

If $\int_\Omega |\phi|^p \,d\mu = 0$ this inequality trivially holds, so we may assume that $\int_\Omega |\phi|^p \,d\mu > 0$. To prove (2.8) in this case, set $g := |\phi|^{p-1}$ (with $g := 1$ if $p = 1$). Then

\[ \int_\Omega |\phi|^p \,d\mu = \int_\Omega |\phi|\, g \,d\mu \le \int_\Omega |f|\, g \,d\mu = \|fg\|_1 \le M \|g\|_q. \tag{2.9} \]

For $p = 1$ we have $\|g\|_q = \|1\|_\infty = 1$ and (2.8) follows from (2.9). For $1 < p < \infty$ we have $1 < q < \infty$ and

\[ \|g\|_q^q = \sum_{j=1}^k |c_j|^{(p-1)q} \mu(F_j) = \sum_{j=1}^k |c_j|^p \mu(F_j) = \int_\Omega |\phi|^p \,d\mu. \]

Taking $q$th roots on both sides and substituting the result into (2.9), we obtain

\[ \int_\Omega |\phi|^p \,d\mu \le M \Bigl( \int_\Omega |\phi|^p \,d\mu \Bigr)^{1/q}, \]

which is the same as saying that (2.8) holds.

Now let $0 \le \phi_n \uparrow |f|$ $\mu$-almost everywhere in (2.8), with each $\phi_n$ a $\mu$-simple function. Applying the previous inequality to $\phi_n$, the monotone convergence theorem gives (2.7). This proves that $f \in L^p(\Omega)$ and $\|f\|_p \le M$. This completes the proof for $1 \le p < \infty$.


Step 3 – Suppose next that $p = \infty$ and $(\Omega, \mathscr{F}, \mu)$ is $\sigma$-finite. Suppose, for a contradiction, that $f$ does not belong to $L^\infty(\Omega)$. Then for all $n = 1,2,\dots$ the set $A_n := \{|f| > n\}$ has strictly positive measure. If $\Omega = \bigcup_{j\ge 1} B_j$ with $\mu(B_j) < \infty$ for all $j$ (such sets exist by $\sigma$-finiteness), then for each $n$ there must be an index $j = j_n$ such that $A_n \cap B_{j_n}$ has strictly positive (and finite) measure $\mu_n$. Then $g_n := \frac{1}{\mu_n} \mathbf{1}_{A_n \cap B_{j_n}}$ belongs to $L^1(\Omega)$ and has norm one, and we have

\[ \|fg_n\|_1 = \frac{1}{\mu_n} \int_{A_n \cap B_{j_n}} |f| \,d\mu \ge n = n \|g_n\|_1. \]

This contradicts (2.6).


Remark 2.27. The argument of Step 3 proves, more generally, that if $(\Omega, \mathscr{F}, \mu)$ is a $\sigma$-finite measure space and $1 \le p \le \infty$, then a measurable function $f$ belongs to $L^\infty(\Omega)$ if and only if $fg \in L^p(\Omega)$ and $\|fg\|_p \le M \|g\|_p$ for some constant $M \ge 0$ and all $g \in Y$, where $Y$ is a dense subspace of $L^p(\Omega)$; in that case we have $\|f\|_\infty \le M$.


2.3.b Approximation by Mollification 


It is generally difficult to handle $L^p$-functions directly. There are two ways of dealing with this problem: by approximation it often suffices to consider functions that are easier to deal with, and by interpolation one can reduce matters to exponents that are easier to deal with. The present section is devoted to approximation techniques; interpolation is treated in Section 5.7.

We begin by proving that the $\mu$-simple functions are dense in $L^p(\Omega)$ for $1 \le p < \infty$. Recall from Definition 1.49 that a $\mu$-simple function is a simple function supported on sets of finite $\mu$-measure.

Proposition 2.28 (Approximation by $\mu$-simple functions). For $1 \le p < \infty$, the $\mu$-simple functions are dense in $L^p(\Omega)$. The same result holds for $L^\infty(\Omega)$ if $\mu(\Omega) < \infty$.


Proof  Fix a function $f \in L^p(\Omega)$.

First let $1 \le p < \infty$. By dominated convergence we have

\[ \lim_{n\to\infty} \mathbf{1}_{\{1/n < |f| < n\}} f = f \]

in $L^p(\Omega)$. Moreover,

\[ \mu\{|f| > 1/n\} = \int_\Omega \mathbf{1}_{\{|f| > 1/n\}} \,d\mu \le \int_\Omega \mathbf{1}_{\{|f| > 1/n\}} |nf|^p \,d\mu \le n^p \|f\|_p^p < \infty. \]

We may therefore assume that $f$ is bounded and $\mu$ is a finite measure. By considering real and imaginary parts separately we may also assume that $f$ is real-valued. Under these assumptions we have $f_n \to f$ in $L^p(\Omega)$, where

\[ f_n := \sum_{j\in\mathbb{Z}} \frac{j}{2^n}\, \mathbf{1}_{\{ j/2^n \le f < (j+1)/2^n \}} \]

are $\mu$-simple functions (by the boundedness of $f$ these sums have only finitely many nonzero contributions).

If $f \in L^\infty(\Omega)$ with $\mu(\Omega) < \infty$, the functions $f_n$ defined above are $\mu$-simple and approximate $f$ uniformly.


More interesting is the fact that if $D$ is an open subset of $\mathbb{R}^d$, then the vector space $C_c^\infty(D)$ of all compactly supported smooth functions $f : D \to \mathbb{K}$ is dense in $L^p(D)$ for every $1 \le p < \infty$. Here, and in what follows, the support $\operatorname{supp}(f)$ of a continuous function $f : D \to \mathbb{K}$ is defined as the complement of the largest open set $U$ such that $f = 0$ on $D \cap U$ or, equivalently, as the closure of the set $\{x \in D : f(x) \ne 0\}$.

Proposition 2.29 (Approximation by compactly supported smooth functions). Let $1 \le p < \infty$ and let $D \subseteq \mathbb{R}^d$ be open. Then $C_c^\infty(D)$ is dense in $L^p(D)$.

Proof  For $f \in L^p(D)$ we have $\lim_{n\to\infty} \|f - \mathbf{1}_{B(0;n)} f\|_p = 0$ by dominated convergence, so there is no loss of generality in assuming that $D$ is bounded. Also, by Proposition 2.28, every $f \in L^p(D)$ can be approximated by simple functions supported on $D$. Hence it suffices to prove that every simple function supported on a bounded open set $D$ can be approximated in $L^p(D)$ by functions in $C_c^\infty(D)$. By linearity and the triangle inequality, it even suffices to approximate indicator functions of the form $\mathbf{1}_B$ for Borel sets $B \subseteq D$.

Given $\varepsilon > 0$, choose an open set $U \subseteq D$ and a closed set $F \subseteq B$ such that $F \subseteq B \subseteq U$ and $|U \setminus F| < \varepsilon$; this is possible by the regularity of the Lebesgue measure on $D$ (Proposition E.16). Let $\phi \in C_c^\infty(D)$ satisfy $0 \le \phi \le 1$ pointwise, $\phi = 1$ on $F$, and $\phi = 0$ outside $U$. As outlined in Problem 2.9, the existence of such functions can be demonstrated by elementary calculus arguments. Then

\[ \|\phi - \mathbf{1}_B\|_{L^p(D)}^p = \int_D |\phi - \mathbf{1}_B|^p \,dx \le |U \setminus F| < \varepsilon. \]

Since the choice of $\varepsilon > 0$ was arbitrary, this completes the proof.


The corresponding result for $p = \infty$ is wrong: if $D$ is nonempty and open, then $C_c(D)$, the vector space of all compactly supported continuous functions $f : D \to \mathbb{K}$, fails to be dense in $L^\infty(D)$. Indeed, if $D'$ is a nonempty open set properly contained in $D$, then $\|f - \mathbf{1}_{D'}\|_{L^\infty(D)} \ge \frac{1}{2}$ for every $f \in C_c(D)$ whose support intersects both $D'$ and its complement.

The separability of the spaces $C(B(0;n))$ implies:

Corollary 2.30. Let $1 \le p < \infty$ and let $D \subseteq \mathbb{R}^d$ be open. Then $L^p(D)$ is separable.


Remark 2.31. Since finite Borel measures on metric spaces are regular (Proposition E.16), using Urysohn's lemma (Proposition C.10) the same argument proves that if $\mu$ is a finite Borel measure on a compact metric space $K$, then $C(K)$ is dense in $L^p(K,\mu)$ for all $1 \le p < \infty$.

Combining this observation with Proposition 2.8, as a corollary we obtain that, under these assumptions, $L^p(K,\mu)$ is separable for all $1 \le p < \infty$.


As an application of Proposition 2.29 we prove an $L^p$-continuity result for translation.

Proposition 2.32 (Continuity of translation). Let $f \in L^p(\mathbb{R}^d)$ with $1 \le p < \infty$. Then

\[ \lim_{|h|\to 0} \|f(\cdot + h) - f(\cdot)\|_p = 0. \]

Proof  Define, for $h \in \mathbb{R}^d$ and $x \in \mathbb{R}^d$, $(\tau_h f)(x) := f(x+h)$, that is, $\tau_h f$ is the translate of $f$ over $h$. Clearly, $\|\tau_h f\|_p = \|f\|_p$.

First consider a function $f \in C_c(\mathbb{R}^d)$. Such a function is uniformly continuous, so given an $\varepsilon > 0$, we may choose $\delta > 0$ such that $|x - x'| < \delta$ implies $|f(x) - f(x')| < \varepsilon$. Hence if $|h| < \delta$, then for all $x \in \mathbb{R}^d$ we have $|\tau_h f(x) - f(x)| < \varepsilon$. Choose $r > 0$ large enough such that the support of $f$ is contained in the rectangle $(-r,r)^d$. If $|h| < \delta$ is so small that the support of $\tau_h f$ is also contained in $(-r,r)^d$, then

\[ \|\tau_h f - f\|_p^p = \int_{(-r,r)^d} |f(x+h) - f(x)|^p \,dx \le \varepsilon^p \int_{(-r,r)^d} dx = \varepsilon^p (2r)^d. \]

This proves that $\lim_{|h|\to 0} \|\tau_h f - f\|_p = 0$ for all $f \in C_c(\mathbb{R}^d)$.

Now let $f \in L^p(\mathbb{R}^d)$ be arbitrary. Since $C_c(\mathbb{R}^d)$ is dense in $L^p(\mathbb{R}^d)$ by Proposition 2.29, we can find $g \in C_c(\mathbb{R}^d)$ such that $\|f - g\|_p < \varepsilon$. Choose $\eta > 0$ so small that $|h| < \eta$ implies $\|\tau_h g - g\|_p < \varepsilon$; this is possible by what we just proved. Then, for $|h| < \eta$,

\[ \|\tau_h f - f\|_p \le \|\tau_h f - \tau_h g\|_p + \|\tau_h g - g\|_p + \|g - f\|_p < \varepsilon + \varepsilon + \varepsilon = 3\varepsilon, \]

noting that $\|\tau_h f - \tau_h g\|_p = \|\tau_h(f - g)\|_p = \|f - g\|_p < \varepsilon$.
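
Continuity of translation holds even for discontinuous functions. As a numerical illustration, take $f = \mathbf{1}_{[0,1)}$ on $\mathbb{R}$: for $|h| \le 1$ the functions $\tau_h f$ and $f$ differ exactly on a set of measure $2|h|$, so $\|\tau_h f - f\|_p = (2|h|)^{1/p}$. The sketch below checks this with a midpoint Riemann sum; the function name `translation_gap`, the grid size and the integration window are assumptions chosen for illustration.

```python
def translation_gap(h, p, cells=100_000, lo=-2.0, hi=3.0):
    """Midpoint Riemann sum for ||tau_h f - f||_p on the real line,
    where f is the indicator function of [0, 1)."""
    def f(x):
        return 1.0 if 0.0 <= x < 1.0 else 0.0

    dx = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        x = lo + (i + 0.5) * dx                   # midpoint of the i-th cell
        total += abs(f(x + h) - f(x)) ** p        # (tau_h f)(x) = f(x + h)
    return (total * dx) ** (1.0 / p)


# The gap shrinks like (2|h|)^(1/p) as h -> 0.
gaps = {h: translation_gap(h, p=2) for h in (0.4, 0.1, 0.025)}
for h, gap in gaps.items():
    print(h, gap, (2 * h) ** 0.5)
```

The rate $(2|h|)^{1/p}$ also shows that the convergence fails for $p = \infty$, where the gap equals 1 for every $h \ne 0$.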


We now turn to an approximation technique based on convolution. It relies on a fun- 
damental inequality, known as Young’s inequality. 


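Proposition 2.33 (Young). Let $1 \le p,q,r \le \infty$ satisfy $\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}$, and let $f \in L^p(\mathbb{R}^d)$ and $g \in L^q(\mathbb{R}^d)$. Then:

(1) for almost all $x \in \mathbb{R}^d$ the function $y \mapsto f(x-y)g(y)$ is integrable;
(2) the function $f * g : \mathbb{R}^d \to \mathbb{K}$, defined for almost all $x \in \mathbb{R}^d$ by

\[ (f * g)(x) := \int_{\mathbb{R}^d} f(x-y)g(y) \,dy, \]

belongs to $L^r(\mathbb{R}^d)$ and we have

\[ \|f * g\|_r \le \|f\|_p \|g\|_q. \]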


Moreover we have $f * g = g * f$ in $L^r(\mathbb{R}^d)$; and if $\frac{1}{r} + \frac{1}{s} = 1 + \frac{1}{t}$, then for all $h \in L^s(\mathbb{R}^d)$ we have $(f * g) * h = f * (g * h)$ in $L^t(\mathbb{R}^d)$.


The most important special case corresponds to the choices $q = 1$ and $r = p$, for which the proof below simplifies considerably.


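Proof  The identity $\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{r}$ implies $\frac{1}{r} + \frac{r-p}{pr} + \frac{r-q}{qr} = 1$ with $r \ge p,q$. Hence, by elementary rewriting and Hölder's inequality (for three functions, see Remark 2.24), for all $x \in \mathbb{R}^d$ we have

\[
\begin{aligned}
\int_{\mathbb{R}^d} |f(x-y)g(y)| \,dy
&= \int_{\mathbb{R}^d} \bigl( |f(x-y)|^p |g(y)|^q \bigr)^{1/r}\, |f(x-y)|^{(r-p)/r}\, |g(y)|^{(r-q)/r} \,dy \\
&\le \bigl\| \bigl( |f(x-\cdot)|^p |g(\cdot)|^q \bigr)^{1/r} \bigr\|_r \, \bigl\| |f(x-\cdot)|^{(r-p)/r} \bigr\|_{\frac{pr}{r-p}} \, \bigl\| |g(\cdot)|^{(r-q)/r} \bigr\|_{\frac{qr}{r-q}} \\
&=: (I) \cdot (II) \cdot (III).
\end{aligned}
\]

Now

\[
\begin{aligned}
(I) &= \Bigl( \int_{\mathbb{R}^d} |f(x-y)|^p |g(y)|^q \,dy \Bigr)^{1/r}, \\
(II) &= \Bigl( \int_{\mathbb{R}^d} |f(x-y)|^p \,dy \Bigr)^{\frac{r-p}{pr}} = \|f\|_p^{(r-p)/r}, \\
(III) &= \Bigl( \int_{\mathbb{R}^d} |g(y)|^q \,dy \Bigr)^{\frac{r-q}{qr}} = \|g\|_q^{(r-q)/r}.
\end{aligned}
\]

Putting things together and using Fubini's theorem, it follows that

\[
\begin{aligned}
\int_{\mathbb{R}^d} |(f * g)(x)|^r \,dx
&\le \|f\|_p^{r-p} \|g\|_q^{r-q} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(x-y)|^p |g(y)|^q \,dy \,dx \\
&= \|f\|_p^{r-p} \|g\|_q^{r-q} \int_{\mathbb{R}^d} |g(y)|^q \int_{\mathbb{R}^d} |f(x-y)|^p \,dx \,dy = \|f\|_p^r \|g\|_q^r.
\end{aligned}
\]

This implies the first assertion as well as the second.

The identity $f * g = g * f$ follows by a change of variables and $(f * g) * h = f * (g * h)$ by Fubini's theorem.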


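The function $f * g$ is called the convolution of $f$ and $g$.

Proposition 2.34 (Approximation by mollification). Let $f \in L^p(\mathbb{R}^d)$ with $1 \le p < \infty$. Let $\phi \in L^1(\mathbb{R}^d)$ satisfy $\int_{\mathbb{R}^d} \phi(x) \,dx = 1$. For $\varepsilon > 0$ define

\[ \phi_\varepsilon(x) := \varepsilon^{-d} \phi(\varepsilon^{-1}x), \qquad x \in \mathbb{R}^d. \]

Then

\[ \lim_{\varepsilon \downarrow 0} \|\phi_\varepsilon * f - f\|_p = 0. \]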

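Proof  We proceed in three steps.

Step 1 – First assume that $\phi$ and $f$ belong to $C_c(\mathbb{R}^d)$. Since $\int_{\mathbb{R}^d} \phi_\varepsilon(y) \,dy = 1$, for all $x \in \mathbb{R}^d$ we have

\[ \phi_\varepsilon * f(x) - f(x) = \int_{\mathbb{R}^d} \phi_\varepsilon(y) [f(x-y) - f(x)] \,dy = \int_{\mathbb{R}^d} \phi(y) [f(x - \varepsilon y) - f(x)] \,dy. \]

Taking $L^p(\mathbb{R}^d)$ norms on both sides and using that the $L^p(\mathbb{R}^d)$-valued function $y \mapsto \phi(y)[f(\cdot - \varepsilon y) - f(\cdot)]$, which is continuous by Proposition 2.32 and compactly supported (and hence supported on some large enough closed cube), is Riemann integrable, by Proposition 1.44 we obtain

\[ \|\phi_\varepsilon * f - f\|_p = \Bigl\| \int_{\mathbb{R}^d} \phi(y) [f(\cdot - \varepsilon y) - f(\cdot)] \,dy \Bigr\|_p \le \int_{\mathbb{R}^d} |\phi(y)| \, \|f(\cdot - \varepsilon y) - f(\cdot)\|_p \,dy. \]

Since $\|f(\cdot - \varepsilon y) - f(\cdot)\|_p \le 2\|f\|_p$ uniformly in $\varepsilon$ and $y$, and $\phi \in L^1(\mathbb{R}^d)$, by dominated convergence it suffices to show that $f(\cdot - \varepsilon y) \to f(\cdot)$ in $L^p(\mathbb{R}^d)$ for each fixed $y$. This again follows from Proposition 2.32. This completes the proof for functions $f$ and $\phi$ in $C_c(\mathbb{R}^d)$.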

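[Portrait: Maurice Fréchet, 1878-1973]

Step 2 – Still assuming that $\phi \in C_c(\mathbb{R}^d)$, we next extend the result to general $f \in L^p(\mathbb{R}^d)$. Fix $\varepsilon > 0$ and choose $g \in C_c(\mathbb{R}^d)$ such that $\|f - g\|_p < \varepsilon$. This is possible by Proposition 2.29. By Young's inequality and the identity $\|\phi_\delta\|_1 = \|\phi\|_1$, for any $\delta > 0$ we have

\[
\begin{aligned}
\|\phi_\delta * f - f\|_p
&\le \|\phi_\delta * f - \phi_\delta * g\|_p + \|\phi_\delta * g - g\|_p + \|g - f\|_p \\
&\le \|\phi_\delta\|_1 \|f - g\|_p + \|\phi_\delta * g - g\|_p + \varepsilon \le \varepsilon \|\phi\|_1 + \|\phi_\delta * g - g\|_p + \varepsilon.
\end{aligned}
\]

Letting $\delta \downarrow 0$, the result of Step 1 implies that

\[ \limsup_{\delta \downarrow 0} \|\phi_\delta * f - f\|_p \le \varepsilon (\|\phi\|_1 + 1). \]

Since $\varepsilon > 0$ was arbitrary, this proves that $\lim_{\delta \downarrow 0} \|\phi_\delta * f - f\|_p = 0$.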


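Step 3 – We now pass to the general case where $\phi \in L^1(\mathbb{R}^d)$ satisfies $\int_{\mathbb{R}^d} \phi(x) \,dx = 1$. In order to apply the result of the preceding step, choose a function $\psi \in C_c(\mathbb{R}^d)$ such that $\|\phi - \psi\|_1 < \varepsilon$ and $\int_{\mathbb{R}^d} \psi(y) \,dy = 1$. Such a function exists by Proposition 2.29. Then, by Young's inequality and the result of Step 2 applied to $\psi$,

\[
\begin{aligned}
\|\phi_\delta * f - f\|_p
&\le \|\phi_\delta * f - \psi_\delta * f\|_p + \|\psi_\delta * f - f\|_p \\
&\le \|\phi_\delta - \psi_\delta\|_1 \|f\|_p + \|\psi_\delta * f - f\|_p \le \varepsilon \|f\|_p + \|\psi_\delta * f - f\|_p.
\end{aligned}
\]

Letting $\delta \downarrow 0$ using the result of Step 2, it follows that

\[ \limsup_{\delta \downarrow 0} \|\phi_\delta * f - f\|_p \le \varepsilon \|f\|_p. \]

Since $\varepsilon > 0$ was arbitrary, this proves that $\lim_{\delta \downarrow 0} \|\phi_\delta * f - f\|_p = 0$.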


2.3.c The Fréchet-Kolmogorov Compactness Theorem 


In this section we prove a characterisation of relatively compact sets in $L^p(\mathbb{R}^d)$. It will be used in Chapter 11 to prove the Rellich-Kondrachov theorem on compact embeddings of Sobolev spaces.

Recall the notation $\tau_h f$ for the translate of a function $f$ over $h$,

\[ \tau_h f(x) = f(x+h). \]

Theorem 2.35 (Fréchet-Kolmogorov). Let $1 \le p < \infty$. A subset $S$ of $L^p(\mathbb{R}^d)$ is relatively compact if and only if it satisfies the following two conditions:

(i) $\displaystyle \lim_{|h|\to 0} \sup_{f\in S} \|\tau_h f - f\|_p = 0$;

(ii) $\displaystyle \lim_{\rho\to\infty} \sup_{f\in S} \int_{\complement B(0;\rho)} |f(x)|^p \,dx = 0$.

Proof  'If': Let us begin by proving that (i) and (ii) together imply that the set $S$ is bounded. Choose $r > 0$ such that $\sup_{f\in S} \|\tau_h f - f\|_p \le 1$ for all $h \in \mathbb{R}^d$ with $|h| \le r$, and choose $R > 0$ such that $\sup_{f\in S} \int_{\complement B(0;R)} |f(x)|^p \,dx \le 1$. Fix $h \in \mathbb{R}^d$ with $|h| = r$. For all $f \in S$ and $x \in \mathbb{R}^d$ we have

\[
\begin{aligned}
\|\mathbf{1}_{B(x;R)} f\|_p
&\le \|\mathbf{1}_{B(x;R)} (f - \tau_{-h} f)\|_p + \|\mathbf{1}_{B(x;R)} \tau_{-h} f\|_p \\
&= \|\mathbf{1}_{B(x;R)} (f - \tau_{-h} f)\|_p + \|\mathbf{1}_{B(x-h;R)} f\|_p \le 1 + \|\mathbf{1}_{B(x-h;R)} f\|_p.
\end{aligned}
\]

Hence, by induction,

\[ \|\mathbf{1}_{B(0;R)} f\|_p \le N + \|\mathbf{1}_{B(-Nh;R)} f\|_p. \]

Choose $N \ge 1$ such that $Nr = N|h| > 2R$. Then $B(-Nh;R) \subseteq \complement B(0;R)$ and

\[ \|f\|_p \le \|\mathbf{1}_{B(0;R)} f\|_p + \|\mathbf{1}_{\complement B(0;R)} f\|_p \le N + \|\mathbf{1}_{B(-Nh;R)} f\|_p + \|\mathbf{1}_{\complement B(0;R)} f\|_p \le N + 2. \tag{2.10} \]

This proves that $S$ is bounded.
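Let us now prove that if $S$ satisfies (i) and (ii), then it is relatively compact. Fix $\varepsilon > 0$ and choose $R_\varepsilon > 0$ such that $\sup_{f\in S} \int_{\complement B(0;R_\varepsilon)} |f(x)|^p \,dx < \varepsilon^p$. Set $B_\varepsilon := B(0;R_\varepsilon)$ and

\[ S_\varepsilon := \{\mathbf{1}_{B_\varepsilon} f : f \in S\}. \]

If $f \in S$, then

\[ \|f - \mathbf{1}_{B_\varepsilon} f\|_p = \|\mathbf{1}_{\complement B_\varepsilon} f\|_p < \varepsilon \tag{2.11} \]

and therefore $S \subseteq S_\varepsilon + B_p(0;\varepsilon)$, where $B_p(0;\varepsilon)$ is the open ball in $L^p(\mathbb{R}^d)$ with radius $\varepsilon$ centred at 0.

Choose $r_\varepsilon > 0$ such that $\sup_{f\in S} \|\tau_h f - f\|_p < \varepsilon$ for all $h \in \mathbb{R}^d$ with $|h| < r_\varepsilon$. For such $h$, (2.11) implies that for all $f \in S$ we have

\[ \|\tau_h(\mathbf{1}_{B_\varepsilon} f) - \mathbf{1}_{B_\varepsilon} f\|_p \le \|\tau_h \mathbf{1}_{B_\varepsilon} f - \tau_h f\|_p + \|\tau_h f - f\|_p + \|f - \mathbf{1}_{B_\varepsilon} f\|_p < 3\varepsilon. \tag{2.12} \]

Let $0 \le \phi \in C_c(B(0;r_\varepsilon))$ satisfy $\int_{\mathbb{R}^d} \phi(x) \,dx = 1$ and set

\[ S_\varepsilon^\phi := \{\phi * g : g \in S_\varepsilon\}. \]

For $g \in S_\varepsilon$, the estimate (2.12) implies

\[
\begin{aligned}
\|\phi * g - g\|_p
&= \Bigl\| \int_{\mathbb{R}^d} \phi(y) \bigl( g(\cdot - y) - g(\cdot) \bigr) \,dy \Bigr\|_p \\
&\le \int_{\mathbb{R}^d} \phi(y) \|g(\cdot - y) - g(\cdot)\|_p \,dy = \int_{B(0;r_\varepsilon)} \phi(y) \|g(\cdot - y) - g(\cdot)\|_p \,dy < 3\varepsilon.
\end{aligned}
\]

This shows that $S_\varepsilon \subseteq S_\varepsilon^\phi + B_p(0;3\varepsilon)$ and hence

\[ S \subseteq S_\varepsilon + B_p(0;\varepsilon) \subseteq S_\varepsilon^\phi + B_p(0;4\varepsilon). \]

If we can prove that $S_\varepsilon^\phi$ is relatively compact, it follows from Proposition 1.40 that $S$ is relatively compact.

Every $h \in S_\varepsilon^\phi$ is supported in $B(0;R_\varepsilon + r_\varepsilon)$. We claim that every $h \in S_\varepsilon^\phi$ is continuous and that the set $S_\varepsilon^\phi$, as a subset of $C(B(0;R_\varepsilon + r_\varepsilon))$, is equicontinuous and bounded.

Let $h \in S_\varepsilon^\phi$, say $h = \phi * g$ with $g \in S_\varepsilon$, say $g = \mathbf{1}_{B_\varepsilon} f$ with $f \in S$. By uniform continuity of $\phi$, given $\eta > 0$ there exists $0 < \delta < 1$ such that for all $x,x' \in \mathbb{R}^d$ with $|x - x'| < \delta$ we have $|\phi(x) - \phi(x')| < \eta$. Hence, for all $x,x' \in \mathbb{R}^d$ with $|x - x'| < \delta$, and with $\frac{1}{p} + \frac{1}{q} = 1$,

\[
\begin{aligned}
|h(x) - h(x')|
&\le \int_{\mathbb{R}^d} |\phi(x-y) - \phi(x'-y)| \, |g(y)| \,dy \\
&\le \eta \int_{\mathbb{R}^d} |g(y)| \,dy = \eta \int_{B_\varepsilon} |f(y)| \,dy \le \eta |B_\varepsilon|^{1/q} (N+2),
\end{aligned}
\]

applying Hölder's inequality and (2.10) in the last step. This proves the continuity of $h$. Since the estimate is uniform with respect to $h \in S_\varepsilon^\phi$, it also proves the equicontinuity of $S_\varepsilon^\phi$.

Boundedness of $S_\varepsilon^\phi$ follows from the boundedness of $S$. Indeed, if $h = \phi * g \in S_\varepsilon^\phi$ with $g \in S_\varepsilon$, then by Hölder's inequality with $\frac{1}{p} + \frac{1}{q} = 1$,

\[ |h(x)| \le \int_{\mathbb{R}^d} \phi(y) |g(x-y)| \,dy \le \|\phi\|_q \|g\|_p. \]

By the Arzelà-Ascoli theorem, $S_\varepsilon^\phi$ is relatively compact in $C(B(0;R_\varepsilon + r_\varepsilon))$. Since the natural inclusion mapping from $C(B(0;R_\varepsilon + r_\varepsilon))$ into $L^p(\mathbb{R}^d)$ is bounded by Hölder's inequality, $S_\varepsilon^\phi$ is relatively compact as a subset of $L^p(\mathbb{R}^d)$.

'Only if': If $S$ is relatively compact in $L^p(\mathbb{R}^d)$, then (i) and (ii) follow from Proposition 1.42 applied to the operators $f \mapsto \tau_h f$ for $|h| \downarrow 0$ and $f \mapsto \mathbf{1}_{B(0;\rho)} f$ for $\rho \to \infty$, respectively.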


We have the following immediate corollary for bounded domains. 


Corollary 2.36. Let $1 \le p < \infty$ and let $D$ be a bounded open subset of $\mathbb{R}^d$. A subset $S$ of $L^p(D)$ is relatively compact if and only if

\[ \lim_{|h|\to 0} \sup_{f\in S} \|\tau_h f - f\|_p = 0. \]

Here we identify functions in $L^p(D)$ with their zero extensions in $L^p(\mathbb{R}^d)$.


2.3.d The Lebesgue Differentiation Theorem


By $L^1_{\mathrm{loc}}(\mathbb{R}^d)$ we denote the vector space of functions $f : \mathbb{R}^d \to \mathbb{K}$ that are locally integrable, that is, integrable on every compact subset of $\mathbb{R}^d$, identifying two such functions when they are equal almost everywhere. The aim of this section is to prove the Lebesgue differentiation theorem, which says that if $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, then at almost every point $x \in \mathbb{R}^d$ one has

\[ \lim_{\substack{B \ni x \\ |B| \to 0}} \frac{1}{|B|} \int_B |f(y) - f(x)| \,dy = 0, \]

the limit being taken along the balls $B$ in $\mathbb{R}^d$ containing $x$, letting $|B|$ denote the Lebesgue measure of a measurable set $B$. The proof of this theorem is based on the following lemma. For balls $B = B(x;r)$ in $\mathbb{R}^d$ and real numbers $\lambda > 0$ we set $\lambda B := B(x;\lambda r)$.


Lemma 2.37 (Vitali covering lemma). Every finite collection $\mathscr{B}$ of open balls in $\mathbb{R}^d$ has a subcollection $\mathscr{B}_0$ of pairwise disjoint balls such that each ball $B \in \mathscr{B}$ is contained in $3B_0$ for some ball $B_0 \in \mathscr{B}_0$.

Proof  We proceed by induction on the number $n$ of balls in $\mathscr{B}$. For $n = 1$ the lemma is trivial, for we can take $\mathscr{B}_0 = \mathscr{B}$. Suppose the claim has been verified for every collection of $n$ balls, and let $\mathscr{B}$ be a collection of $n+1$ balls. Let $\mathscr{B}' := \mathscr{B} \setminus \{B_0\}$, where $B_0$ is a ball in $\mathscr{B}$ of minimal radius. By the induction assumption there is a subcollection $\mathscr{B}_0' \subseteq \mathscr{B}'$ of pairwise disjoint balls such that each ball $B \in \mathscr{B}'$ is contained in $3B'$ for some ball $B' \in \mathscr{B}_0'$. We now distinguish two cases.

Case 1. If $B_0$ is disjoint from each ball $B' \in \mathscr{B}_0'$, then the subcollection $\mathscr{B}_0 := \mathscr{B}_0' \cup \{B_0\}$ has the required properties.

Case 2. If $B_0$ intersects a ball $B' \in \mathscr{B}_0'$, then the radius of $B'$ is at least as large as that of $B_0$, from which it follows that $B_0 \subseteq 3B'$. The subcollection $\mathscr{B}_0 := \mathscr{B}_0'$ then has the required properties.
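
Unwinding the induction gives a simple selection procedure: go through the balls in order of decreasing radius and keep a ball whenever it is disjoint from all balls kept so far. The sketch below implements this under the assumption that a ball is represented as a pair `(center, radius)` with `center` a tuple of floats; the function name `vitali_subcollection` and the sample balls are illustrative and not part of the text.

```python
import math


def vitali_subcollection(balls):
    """Select pairwise disjoint open balls such that every ball of the input
    is contained in the 3-fold dilation of some selected ball.

    balls: list of (center, radius) pairs, center a tuple of floats.
    """
    selected = []
    # Largest radius first: this is the order in which the induction in the
    # proof (which repeatedly removes a ball of minimal radius) adds balls.
    for center, radius in sorted(balls, key=lambda b: b[1], reverse=True):
        # Open balls B(c; r) and B(c0; r0) are disjoint iff |c - c0| >= r + r0.
        if all(math.dist(center, c0) >= radius + r0 for c0, r0 in selected):
            selected.append((center, radius))
    return selected


balls = [
    ((0.0, 0.0), 1.0),
    ((1.5, 0.0), 0.8),
    ((4.0, 0.0), 0.5),
    ((0.2, 0.1), 0.3),
    ((4.3, 0.2), 0.4),
]
chosen = vitali_subcollection(balls)
print(chosen)
```

A rejected ball meets a selected ball of at least the same radius, which is exactly Case 2 of the proof; the containment $B(c;r) \subseteq B(c_0;3r_0)$ can be checked via $|c - c_0| + r \le 3r_0$.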


For $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ we define the Hardy-Littlewood maximal function $Mf : \mathbb{R}^d \to [0,\infty]$ by

\[ Mf(x) := \sup_{B \ni x} \frac{1}{|B|} \int_B |f(y)| \,dy, \]

where the supremum is taken over all balls $B$ containing $x$. Since the supremum in the definition of $Mf(x)$ can be realised by using only a fixed countable collection of balls, $Mf$ is a measurable function.

Theorem 2.38 (Hardy-Littlewood maximal theorem). For all $f \in L^1(\mathbb{R}^d)$ and $t > 0$ we have the weak $L^1$-bound

\[ t\,|\{Mf > t\}| \le 3^d \|f\|_1. \]

Moreover, for all $1 < p \le \infty$ there exists a constant $C_{d,p} \ge 0$ such that for all $f \in L^p(\mathbb{R}^d)$ we have $Mf \in L^p(\mathbb{R}^d)$ and

\[ \|Mf\|_p \le C_{d,p} \|f\|_p. \]
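Proof  We begin with the proof of the first assertion. By the definition of $Mf$, for every $x \in \{Mf > t\}$ there exists a ball $B$ containing $x$ such that $\frac{1}{|B|} \int_B |f| > t$. If $K \subseteq \{Mf > t\}$ is a compact subset, it can be covered by a finite collection $\mathscr{B}$ of such balls. Let $\mathscr{B}_0$ be a disjoint subcollection of this cover provided by the Vitali covering lemma. Then,

\[ |K| \le \Bigl| \bigcup_{B\in\mathscr{B}} B \Bigr| \le \Bigl| \bigcup_{B\in\mathscr{B}_0} 3B \Bigr| \le 3^d \sum_{B\in\mathscr{B}_0} |B| \le \frac{3^d}{t} \sum_{B\in\mathscr{B}_0} \int_B |f(y)| \,dy \le \frac{3^d}{t} \|f\|_1. \]

This being true for all compact sets $K$ contained in the open set $\{Mf > t\}$, the first assertion follows.

For $1 < p < \infty$ the second assertion follows from the first by using the integration by parts identity

\[ \int_{\mathbb{R}^d} |g(x)|^p \,dx = p \int_0^\infty t^{p-1} |\{|g| > t\}| \,dt \tag{2.13} \]

for $g \in L^p(\mathbb{R}^d)$ as follows. For any $f \in L^p(\mathbb{R}^d)$ and $t > 0$ the function $f_t := \mathbf{1}_{\{|f| > t/2\}} f$ belongs to $L^1(\mathbb{R}^d)$ and satisfies the pointwise bound

\[ Mf \le \sup_{B \ni x} \frac{1}{|B|} \int_B |f_t(y)| \,dy + \sup_{B \ni x} \frac{1}{|B|} \int_B |\mathbf{1}_{\{|f| \le t/2\}}(y) f(y)| \,dy \le Mf_t + t/2, \]

which implies

\[ \{Mf > t\} \subseteq \{Mf_t > t/2\}. \]

Hence, by the first part of the theorem,

\[ |\{Mf > t\}| \le |\{Mf_t > t/2\}| \le \frac{2 \cdot 3^d}{t} \|f_t\|_1 = \frac{2 \cdot 3^d}{t} \int_{\{|f| > t/2\}} |f(x)| \,dx. \tag{2.14} \]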
By (2.13), (2.14), and Fubini’s theorem, 
©0 2.3¢ 
Mf (x)|Pdx < | p= f dx } dt 
Limroorarsp for fi ire)lay) 
. ara) 
=3.2p f Ircol f 1-2 dt dx 
Rd 0 
34.2 
=F fale tar=ch, f lr@lrar, 
p—1 Jra 
where Cy.) = 2( Ha )W/e, 


For $p = \infty$ the second assertion follows trivially from the pointwise inequality $Mf \le \|f\|_\infty$, with constant $C_{d,\infty} = 1$.

Inspection of this proof reveals that the derivation of the $L^p$-bound for $Mf$ in the second part of the theorem does not use any properties of this function other than the weak $L^1$-bound contained in the first part of the theorem. This observation lies at the basis of the Marcinkiewicz interpolation theorem in Chapter 5 (see Theorem 5.47).
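To get a feeling for the weak $L^1$-bound, one can experiment with a discrete analogue. The sketch below is an illustration under assumptions that are not part of the text: the underlying space is the finite index set $\{0,\dots,n-1\}$ with counting measure, index intervals play the role of balls, and the dimension is $d = 1$, so the constant $3^d$ becomes $3$. The names `maximal_function` and `weak_type_holds` are hypothetical.

```python
from fractions import Fraction

def maximal_function(f):
    """Uncentred discrete maximal function of a finite sequence f.

    M[i] is the largest average of |f| over an index interval [a, b]
    (inclusive) that contains i. Exact arithmetic via Fraction.
    """
    n = len(f)
    prefix = [Fraction(0)]
    for value in f:
        prefix.append(prefix[-1] + abs(value))
    M = [Fraction(0)] * n
    for a in range(n):
        for b in range(a, n):
            average = (prefix[b + 1] - prefix[a]) / (b - a + 1)
            for i in range(a, b + 1):
                M[i] = max(M[i], average)
    return M

def weak_type_holds(f, t):
    """Check t * #{M f > t} <= 3 * ||f||_1 for the discrete maximal function."""
    M = maximal_function(f)
    level_set_size = sum(1 for value in M if value > t)
    return t * level_set_size <= 3 * sum(abs(value) for value in f)

f = [0, 0, 5, 0, 0, 1, 0, 0, 0, 3, 0, 0]
checks = [weak_type_holds(f, Fraction(k, 4)) for k in range(1, 25)]
```

The covering argument of the proof carries over to this discrete setting, because an index interval of length $L$ that meets another interval of length at most $L$ absorbs it after being enlarged by $L$ indices on each side.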

As a corollary to Theorem 2.38 we have the following fundamental result. 


Theorem 2.39 (Lebesgue differentiation theorem). If $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, then for almost all $x \in \mathbb{R}^d$ we have

\[ \lim_{\substack{B \ni x \\ |B| \to 0}} \frac{1}{|B|} \int_B |f(y) - f(x)|\,dy = 0. \tag{2.15} \]

The correct way of interpreting this theorem is as follows. For every pointwise defined locally integrable function $\tilde f$ on $\mathbb{R}^d$ the limit in (2.15) (with $f$ replaced by $\tilde f$) exists for almost all $x \in \mathbb{R}^d$, say on a Borel set $\Omega \subseteq \mathbb{R}^d$ such that $|\mathbb{R}^d \setminus \Omega| = 0$. If both $\tilde f_1$ and $\tilde f_2$ are pointwise representatives of $f$, the symmetric difference of the corresponding sets $\Omega_1$ and $\Omega_2$ has measure 0.

The set of all points $x \in \mathbb{R}^d$ for which (2.15) holds is called the set of Lebesgue points of $f$. As just explained, this set is uniquely determined only up to a set of measure 0.
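A simple example shows how a point can fail to be a Lebesgue point. The Python sketch below works with one fixed pointwise representative, the indicator function of $[0,1)$ on the real line; this choice of function, the helper name `mean_oscillation`, and the use of intervals $(x-a, x+b)$ as balls containing $x$ are assumptions made for the illustration.

```python
from fractions import Fraction

def f(y):
    """Pointwise representative: indicator of the interval [0, 1)."""
    return 1 if 0 <= y < 1 else 0

def mean_oscillation(x, a, b):
    """Average of |f(y) - f(x)| over the interval (x - a, x + b), computed exactly.

    Since f is an indicator, |f(y) - f(x)| equals 1 exactly where f(y) != f(x).
    """
    left, right = x - a, x + b
    # Length of the part of (left, right) that lies inside [0, 1).
    inside = max(Fraction(0), min(right, Fraction(1)) - max(left, Fraction(0)))
    total = right - left
    differs = total - inside if f(x) == 1 else inside
    return differs / total

# Interior point: small intervals around 1/2 see no oscillation at all.
at_half = mean_oscillation(Fraction(1, 2), Fraction(1, 10), Fraction(1, 10))

# Jump point: lopsided intervals around 0 keep the average near 1.
at_zero = [mean_oscillation(Fraction(0), Fraction(1, 10**k), Fraction(1, 10**(k + 2)))
           for k in range(1, 6)]
```

For the lopsided intervals $(-h, h/100)$ the average equals $100/101$ for every $h$, so the limit in (2.15) is not $0$ at $x = 0$; this single point is of course a null set, in agreement with the theorem.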


Proof A point $x \in \mathbb{R}^d$ is a Lebesgue point of $f$ if and only if it is a Lebesgue point of $\mathbf{1}_U f$, for any bounded open set $U$ containing $x$. Hence, upon replacing $f$ by $\mathbf{1}_U f$ if necessary, it suffices to prove the theorem under the stronger assumption $f \in L^1(\mathbb{R}^d)$.

For all $x \in \mathbb{R}^d$ let

\[ Nf(x) := \limsup_{\substack{B \ni x \\ |B| \to 0}} \frac{1}{|B|} \int_B |f(y) - f(x)|\,dy. \]

We wish to prove that $Nf(x) = 0$ for almost all $x \in \mathbb{R}^d$. For this purpose it suffices to show that $|\{Nf > \varepsilon\}| = 0$ for any fixed $\varepsilon > 0$.

For any fixed $\delta > 0$, Proposition 2.29 provides us with a function $g \in C_{\mathrm{c}}(\mathbb{R}^d)$ such that $\|f - g\|_1 < \delta$. Then,

\[ Nf \le N(f-g) + Ng \le M(f-g) + |f-g| + 0, \]

using the pointwise inequality $Nh(x) \le Mh(x) + |h(x)|$ and the continuity of $g$, which implies $Ng(x) = 0$. Therefore, by Theorem 2.38,

\[
\begin{aligned}
|\{Nf > \varepsilon\}| &\le |\{M(f-g) > \varepsilon/2\}| + |\{|f-g| > \varepsilon/2\}| \\
&\le \frac{2 \cdot 3^d}{\varepsilon} \|f-g\|_1 + \frac{2}{\varepsilon} \|f-g\|_1 < \frac{2(3^d + 1)\,\delta}{\varepsilon}.
\end{aligned}
\]

Since $\delta > 0$ was arbitrary it follows that $|\{Nf > \varepsilon\}| = 0$.

2.4 Spaces of Measures


In this section we introduce the space $M(\Omega)$ of $\mathbb{K}$-valued measures on a given measurable space $(\Omega, \mathscr{F})$ and discuss some of its properties. From the functional analytic point of view, the importance of this space resides in the fact that $M(\Omega)$ is a vector space in a natural way by setting

\[ (c\mu)(F) := c\,\mu(F), \qquad (\mu + \nu)(F) := \mu(F) + \nu(F), \qquad F \in \mathscr{F}, \]

and that it is a Banach space with respect to the variation norm introduced in Definition 2.42 (see Theorem 2.44).


2.4.a The Banach Space M(Q) 


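In what follows we fix a measurable space $(\Omega, \mathscr{F})$.

Definition 2.40 ($\mathbb{K}$-valued measures). A $\mathbb{K}$-valued measure on $(\Omega, \mathscr{F})$ is a mapping $\mu : \mathscr{F} \to \mathbb{K}$ with the following properties:

(i) $\mu(\emptyset) = 0$;

(ii) for all disjoint $\mathscr{F}$-measurable sets $F_1, F_2, \dots$ we have

\[ \mu\Bigl( \bigcup_{n \ge 1} F_n \Bigr) = \sum_{n \ge 1} \mu(F_n). \]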


Remark 2.41. An ordinary measure (in the sense of Definition E.3) is a real-valued 
measure if and only if it is finite. 


The terms ‘real-valued measure’ and ‘complex-valued measure’ are often abbreviated 
to ‘real measure’ and ‘complex measure’. 


Definition 2.42 (Variation). Let $\mu$ be a $\mathbb{K}$-valued measure on $(\Omega, \mathscr{F})$. The variation of $\mu$ on the set $F \in \mathscr{F}$ is defined by

\[ |\mu|(F) := \sup_{\mathscr{A} \in \mathfrak{F}_F} \sum_{A \in \mathscr{A}} |\mu(A)|, \]

where $\mathfrak{F}_F$ denotes the set of all finite collections of pairwise disjoint $\mathscr{F}$-measurable subsets of $F$.

It is immediate to verify that $|c\mu| = |c|\,|\mu|$ for all $c \in \mathbb{K}$ and $|\mu + \nu|(F) \le |\mu|(F) + |\nu|(F)$ for all $F \in \mathscr{F}$. If $\mu$ takes values in $[0,\infty)$, then $|\mu| = \mu$.
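On a finite set the supremum in Definition 2.42 can be computed by brute force, which makes the definition concrete. The sketch below assumes a discrete measure given by complex point weights (a dictionary `weights`, our own encoding) and enumerates every finite collection of pairwise disjoint subsets of $F$; the names `measure` and `variation` are hypothetical.

```python
from itertools import product
import math

def measure(weights, A):
    """mu(A) for the discrete measure with the given point weights."""
    return sum(weights[point] for point in A)

def variation(weights, F):
    """|mu|(F): supremum of sum |mu(A)| over disjoint collections of subsets of F.

    Every point of F receives a label; label 0 means "not used", and points
    sharing a positive label form one set of the collection.
    """
    F = list(F)
    best = 0.0
    for labels in product(range(len(F) + 1), repeat=len(F)):
        blocks = {}
        for point, label in zip(F, labels):
            if label > 0:
                blocks.setdefault(label, []).append(point)
        best = max(best, sum(abs(measure(weights, A)) for A in blocks.values()))
    return best

weights = {"a": 3 + 4j, "b": -2, "c": 1j, "d": 0.5}
total_variation = variation(weights, weights)
sum_of_moduli = sum(abs(w) for w in weights.values())
```

In this example the supremum is attained by the collection of singletons, so the variation is the sum of the moduli of the point weights; grouping points can only lose mass through cancellation.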


Proposition 2.43. If $\mu$ is a $\mathbb{K}$-valued measure, then $|\mu|$ is a finite measure.

Proof We proceed in two steps.

Step 1 – We prove that $|\mu|$ is a measure. It is clear that $|\mu|(\emptyset) = 0$. Let $(F_n)_{n \ge 1}$ be a sequence of pairwise disjoint measurable sets and let $F$ be their union. We must prove that $|\mu|(F) = \sum_{n \ge 1} |\mu|(F_n)$.

If $\mathscr{A}_n \in \mathfrak{F}_{F_n}$ is a finite collection of pairwise disjoint measurable subsets of $F_n$, then for every $N \ge 1$ the union $\bigcup_{n=1}^N \mathscr{A}_n$ is a finite collection of pairwise disjoint measurable subsets of $F$ and therefore

\[ \sum_{n=1}^N \sum_{A \in \mathscr{A}_n} |\mu(A)| \le |\mu|(F). \]

Taking the supremum over all $\mathscr{A}_n \in \mathfrak{F}_{F_n}$, it follows that $\sum_{n=1}^N |\mu|(F_n) \le |\mu|(F)$. This being true for all $N \ge 1$ we conclude that

\[ \sum_{n \ge 1} |\mu|(F_n) \le |\mu|(F). \]

In the converse direction suppose that the measurable subsets $A_1, \dots, A_k$ of $F$ are disjoint. With $F_{jn} := A_j \cap F_n$ we have

\[ \sum_{j=1}^k |\mu(A_j)| = \sum_{j=1}^k \Bigl| \sum_{n \ge 1} \mu(F_{jn}) \Bigr| \le \sum_{n \ge 1} \sum_{j=1}^k |\mu(F_{jn})| \le \sum_{n \ge 1} |\mu|(F_n). \]

Taking the supremum over all finite families of pairwise disjoint measurable subsets of $F$ we obtain

\[ |\mu|(F) \le \sum_{n \ge 1} |\mu|(F_n). \]

Step 2 – We prove that the measure $|\mu|$ is finite, that is, $|\mu|(\Omega) < \infty$. By considering real and imaginary parts, it suffices to do this in the real-valued case.

If a finite set $\{r_1, \dots, r_N\}$ of real numbers is given, then either the positive or the negative numbers in this set (or both) contribute at least half to the sum $\sum_{n=1}^N |r_n|$. Enumerating this set (or one of them) as $r_{n_1}, \dots, r_{n_k}$, we thus have

\[ \Bigl| \sum_{j=1}^k r_{n_j} \Bigr| \ge \frac{1}{2} \sum_{n=1}^N |r_n|. \tag{2.16} \]


Suppose, for a contradiction, that some measurable set $F$ satisfies $|\mu|(F) = \infty$. Choose disjoint measurable subsets $F_1, \dots, F_N$ of $F$ such that

\[ \sum_{n=1}^N |\mu(F_n)| \ge 2(1 + |\mu(F)|). \]

By (2.16) the union $F'$ of a suitable subcollection of the $F_n$ satisfies

\[ |\mu(F')| \ge 1 + |\mu(F)|. \]

For $F'' := F \setminus F'$ we then have

\[ |\mu(F'')| = |\mu(F) - \mu(F')| \ge \bigl| |\mu(F')| - |\mu(F)| \bigr| \ge 1. \]

Thus if a set $F \in \mathscr{F}$ satisfies $|\mu|(F) = \infty$, there is a disjoint decomposition $F = F' \cup F''$ with $|\mu(F')| \ge 1$ and $|\mu(F'')| \ge 1$. Since $|\mu|$ is a measure, at least one of the numbers $|\mu|(F')$ and $|\mu|(F'')$ equals $\infty$. We take one of them and continue applying what we just proved inductively. This produces a sequence of pairwise disjoint measurable sets $G_1, G_2, \dots$, each of which satisfies $|\mu(G_k)| \ge 1$. Let $G$ be their union. Since $\mu$ is a $\mathbb{K}$-valued measure we have $\mu(G) = \sum_{k \ge 1} \mu(G_k)$. This sum cannot converge since its terms fail to converge to 0. This is the required contradiction.


Theorem 2.44 (Completeness). Endowed with the variation norm

\[ \|\mu\| := |\mu|(\Omega), \]

$M(\Omega)$ is a Banach space.


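Proof We leave it as an exercise to prove that $\|\mu\| := |\mu|(\Omega)$ defines a norm.

To prove completeness, let $(\mu_n)_{n \ge 1}$ be a Cauchy sequence in $M(\Omega)$. For all $F \in \mathscr{F}$,

\[ |\mu_n(F) - \mu_m(F)| = |(\mu_n - \mu_m)(F)| \le |\mu_n - \mu_m|(F) \le \|\mu_n - \mu_m\|, \]

proving that the sequence $(\mu_n(F))_{n \ge 1}$ is Cauchy in $\mathbb{K}$. Let $\mu(F)$ denote its limit. We wish to show that the resulting mapping $\mu : \mathscr{F} \to \mathbb{K}$ is a $\mathbb{K}$-valued measure and that $\lim_{n \to \infty} \mu_n = \mu$ with respect to the norm of $M(\Omega)$.

It is clear that $\mu(\emptyset) = \lim_{n \to \infty} \mu_n(\emptyset) = 0$. Suppose now that $(F_m)_{m \ge 1}$ is a sequence of pairwise disjoint measurable sets and let $F := \bigcup_{m \ge 1} F_m$. Given $\varepsilon > 0$, choose $N \ge 1$ so large that $\|\mu_j - \mu_k\| < \varepsilon$ for all $j, k \ge N$. Since $\mu_N$ is countably additive we may choose $N' \ge 1$ so large that $|\mu_N(F) - \sum_{m=1}^M \mu_N(F_m)| < \varepsilon$ for all $M \ge N'$. Then, for $M \ge N'$,

\[
\begin{aligned}
\Bigl| \mu(F) - \sum_{m=1}^M \mu(F_m) \Bigr|
&= \lim_{n \to \infty} \Bigl| \mu_n\Bigl( \bigcup_{m \ge M+1} F_m \Bigr) \Bigr| \\
&\le \Bigl| \mu_N\Bigl( \bigcup_{m \ge M+1} F_m \Bigr) \Bigr| + \sup_{n \ge N+1} \Bigl| (\mu_n - \mu_N)\Bigl( \bigcup_{m \ge M+1} F_m \Bigr) \Bigr| \\
&\le \Bigl| \mu_N(F) - \sum_{m=1}^M \mu_N(F_m) \Bigr| + \sup_{n \ge N+1} \|\mu_n - \mu_N\| < 2\varepsilon.
\end{aligned}
\]

Since $\varepsilon > 0$ was arbitrary, this proves that $\sum_{m \ge 1} \mu(F_m) = \mu(F)$.

Finally, if $F_1, \dots, F_k$ are disjoint and measurable, then for all $m \ge N$ we have

\[
\begin{aligned}
\sum_{j=1}^k |(\mu - \mu_m)(F_j)| &= \lim_{n \to \infty} \sum_{j=1}^k |(\mu_n - \mu_m)(F_j)| \\
&\le \limsup_{n \to \infty} |\mu_n - \mu_m|(\Omega) = \limsup_{n \to \infty} \|\mu_n - \mu_m\| \le \varepsilon.
\end{aligned}
\]

Taking the supremum over finite disjoint families of measurable sets, we find that $\|\mu - \mu_m\| \le \varepsilon$ for all $m \ge N$. This proves the required convergence.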


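If $\mu$ is a complex measure on $(\Omega, \mathscr{F})$, then

\[ (\operatorname{Re}\mu)(F) := \operatorname{Re}(\mu(F)), \qquad (\operatorname{Im}\mu)(F) := \operatorname{Im}(\mu(F)), \]

define real measures on $(\Omega, \mathscr{F})$ and we have $\mu = \operatorname{Re}\mu + i \operatorname{Im}\mu$. The next result shows that real measures allow decompositions into positive and negative parts:

Theorem 2.45 (Hahn–Jordan decomposition). If $\mu$ is a real measure on a measurable space $(\Omega, \mathscr{F})$, then

\[
\begin{aligned}
\mu^+(F) &:= \sup\{\mu(A) : A \in \mathscr{F},\ A \subseteq F\}, \\
\mu^-(F) &:= -\inf\{\mu(A) : A \in \mathscr{F},\ A \subseteq F\}
\end{aligned}
\]

are finite measures on $(\Omega, \mathscr{F})$ and

\[ \mu = \mu^+ - \mu^-, \qquad |\mu| = \mu^+ + \mu^-. \]

The measures $\mu^+$ and $\mu^-$ are supported on disjoint sets, in the sense that there exists a disjoint decomposition $\Omega = \Omega^+ \cup \Omega^-$ with $\Omega^\pm \in \mathscr{F}$ such that for all $F \in \mathscr{F}$ we have

\[ \mu^+(F) = \mu(F \cap \Omega^+), \qquad \mu^-(F) = -\mu(F \cap \Omega^-). \]

If $\nu_1$ and $\nu_2$ are finite measures on $(\Omega, \mathscr{F})$ such that $\mu = \nu_1 - \nu_2$, then for all $F \in \mathscr{F}$ we have

\[ \nu_1(F) \ge \mu^+(F), \qquad \nu_2(F) \ge \mu^-(F). \]

The decomposition $\mu = \mu^+ - \mu^-$ for real measures $\mu$ is called the Jordan decomposition; the existence of a corresponding decomposition $\Omega = \Omega^+ \cup \Omega^-$ for their supports is often referred to as the Hahn decomposition theorem.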
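For a real measure on a finite set the theorem can be checked by exhaustive search. In the sketch below the measure is encoded by integer point weights (the dictionary `weights` is an assumption of the illustration), $\mu^\pm$ are computed directly from the sup/inf formulas, and the Hahn decomposition is taken to be $\Omega^+ = \{\text{weight} \ge 0\}$, $\Omega^- = \{\text{weight} < 0\}$; the function names are hypothetical.

```python
from itertools import chain, combinations

def subsets(F):
    """All subsets of the finite set F, as tuples."""
    F = list(F)
    return chain.from_iterable(combinations(F, k) for k in range(len(F) + 1))

def measure(weights, A):
    return sum(weights[point] for point in A)

def mu_plus(weights, F):
    """mu^+(F) = sup { mu(A) : A subset of F }."""
    return max(measure(weights, A) for A in subsets(F))

def mu_minus(weights, F):
    """mu^-(F) = -inf { mu(A) : A subset of F }."""
    return -min(measure(weights, A) for A in subsets(F))

weights = {"a": 3, "b": -2, "c": 5, "d": -1}
omega = list(weights)
omega_plus = [p for p in omega if weights[p] >= 0]
omega_minus = [p for p in omega if weights[p] < 0]

# Compare the sup/inf formulas with restriction to the Hahn sets, for every F.
hahn_ok = all(
    mu_plus(weights, F) == measure(weights, [p for p in F if p in omega_plus])
    and mu_minus(weights, F) == -measure(weights, [p for p in F if p in omega_minus])
    for F in subsets(omega)
)
```

Here $\mu^+(\Omega) = 8$, $\mu^-(\Omega) = 3$, so that $\mu(\Omega) = 5$ and $|\mu|(\Omega) = 11$.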


Proof We begin with the construction of the sets $\Omega^+$ and $\Omega^-$. Let us call a set $F \in \mathscr{F}$ positive (resp. negative) if for all $A \in \mathscr{F}$ with $A \subseteq F$ we have $\mu(A) \ge 0$ (resp. $\mu(A) \le 0$). We use the notation $F \ge 0$ (resp. $F \le 0$) to express that $F \in \mathscr{F}$ and $F$ is positive (resp. negative).

Finite and countable unions of positive sets are positive. Indeed, suppose that the sets $F_n$, $n \ge 1$, are positive. Set $G_1 := F_1$ and $G_n := F_n \setminus \bigcup_{j=1}^{n-1} F_j$ for $n \ge 2$. Then $\bigcup_{n \ge 1} F_n = \bigcup_{n \ge 1} G_n$. If $A \in \mathscr{F}$ is contained in this union, the countable additivity of $\mu$ implies $\mu(A) = \sum_{n \ge 1} \mu(G_n \cap A) \ge 0$, keeping in mind that $G_n \cap A \subseteq F_n$ and $F_n \ge 0$.

Let

\[ M := \sup_{F \ge 0} \mu(F) \]

and note that $M < \infty$. Choose positive sets $F_n$, $n \ge 1$, such that $\lim_{n \to \infty} \mu(F_n) = M$. By the observation just made we may assume that $F_1 \subseteq F_2 \subseteq \dots$ By the same observation, $\Omega^+ := \bigcup_{n \ge 1} F_n$ is positive and therefore $\mu(\Omega^+) \le M$. The positivity of $\Omega^+$ also implies $\mu(F_n) \le \mu(\Omega^+)$, and therefore $M = \lim_{n \to \infty} \mu(F_n) \le \mu(\Omega^+)$. We have shown that $\mu(\Omega^+) = M$.

We show next that $\Omega^- := \complement\Omega^+$ is a negative set. Suppose, for a contradiction, that this were false. Then $\Omega^-$ contains a subset $A_0 \in \mathscr{F}$ with $\mu(A_0) > 0$. If $A_0$ were positive, then so would be $\Omega^+ \cup A_0$, but then $\mu(\Omega^+ \cup A_0) = M + \mu(A_0) > M$ contradicts the choice of $M$. It follows that there exists a smallest integer $k_1$ with the property that $A_0$ contains a subset $A_1 \in \mathscr{F}$ with $\mu(A_1) < -\frac{1}{k_1}$. Since $\mu(A_0 \setminus A_1) = \mu(A_0) - \mu(A_1) > 0$ we can repeat this construction to find the smallest integer $k_2$ with the property that $A_0 \setminus A_1$ contains a subset $A_2 \in \mathscr{F}$ with $\mu(A_2) < -\frac{1}{k_2}$. Continuing this way we obtain a sequence of pairwise disjoint sets $(A_n)_{n \ge 1}$, all contained in $A_0$, such that $\mu(A_n) < -\frac{1}{k_n}$ for all $n \ge 1$. We must have $\lim_{n \to \infty} k_n = \infty$, since otherwise the union $A := \bigcup_{n \ge 1} A_n$ would satisfy $\mu(A) = -\infty$.

Let $B := A_0 \setminus A$. Then $\mu(B) = \mu(A_0) - \mu(A) > 0$ and $B \ge 0$: for if we had $C \in \mathscr{F}$ with $C \subseteq B$ and $\mu(C) < 0$, then $\mu(C) < -\frac{1}{k}$ for some integer $k$. The existence of such a set $C$ contradicts the minimality of the $k_n$ for large enough $n$. The set $\Omega^+ \cup B$ is positive and satisfies $\mu(\Omega^+ \cup B) = M + \mu(B) > M$, contradicting the choice of $M$. We conclude that $\Omega^-$ is negative.

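We have shown that for all $F \in \mathscr{F}$ we have

\[ \mu(F \cap \Omega^+) \ge 0, \qquad \mu(F \cap \Omega^-) \le 0. \]

We may thus define positive measures $\mu_\pm$ by

\[ \mu_+(F) := \mu(F \cap \Omega^+), \qquad \mu_-(F) := -\mu(F \cap \Omega^-). \]

It is clear that $\mu = \mu_+ - \mu_-$. Since $\mu(A) = \mu_+(A) - \mu_-(A) \le \mu_+(A) = \mu(A \cap \Omega^+)$ we see that

\[ \mu^+(F) = \sup\{\mu(A) : A \in \mathscr{F},\ A \subseteq F\} \le \sup\{\mu(A \cap \Omega^+) : A \in \mathscr{F},\ A \subseteq F\}. \]

The converse inequality trivially holds, for $A \subseteq F$ implies $A \cap \Omega^+ \subseteq F$. Hence we have equality, and then $\mu_+(A) = \mu(A \cap \Omega^+)$ implies

\[
\begin{aligned}
\mu^+(F) &= \sup\{\mu(A \cap \Omega^+) : A \in \mathscr{F},\ A \subseteq F\} \\
&= \sup\{\mu_+(A) : A \in \mathscr{F},\ A \subseteq F\} = \mu_+(F).
\end{aligned}
\]

The identity $\mu^- = \mu_-$ is proved in the same way. The countable additivity of $\mu^+$ and $\mu^-$ is an immediate consequence.

Next, $|\mu|(F) = |\mu^+ - \mu^-|(F) \le |\mu^+|(F) + |\mu^-|(F) = \mu^+(F) + \mu^-(F)$. In the converse direction, write $F = F^+ \cup F^-$ with $F^\pm := F \cap \Omega^\pm$. Then the positivity of $F^+$ and the negativity of $F^-$ imply

\[ |\mu|(F) \ge |\mu(F^+)| + |\mu(F^-)| = \mu(F^+) - \mu(F^-) = \mu^+(F) + \mu^-(F). \]

Finally, if $\nu_1$ and $\nu_2$ are finite measures such that $\mu = \nu_1 - \nu_2$, then

\[ \nu_1(F) \ge \nu_1(F \cap \Omega^+) \ge \nu_1(F \cap \Omega^+) - \nu_2(F \cap \Omega^+) = \mu(F \cap \Omega^+) = \mu^+(F). \]

The proof that $\nu_2(F) \ge \mu^-(F)$ is similar.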


2.4.b The Radon-Nikodym Theorem 


If $f : \Omega \to \mathbb{K}$ is integrable with respect to the measure $\mu$, then the $\mathbb{K}$-valued measure

\[ \nu(F) := \int_F f\,d\mu, \qquad F \in \mathscr{F}, \]

is absolutely continuous with respect to $\mu$, that is, $\mu(F) = 0$ implies $\nu(F) = 0$. The following theorem provides a converse under a $\sigma$-finiteness assumption.

Theorem 2.46 (Radon–Nikodym). Let $(\Omega, \mathscr{F}, \mu)$ be a $\sigma$-finite measure space. If the $\mathbb{K}$-valued measure $\nu : \mathscr{F} \to \mathbb{K}$ is absolutely continuous with respect to $\mu$, then there exists a unique $g \in L^1(\Omega, \mu)$ such that

\[ \nu(F) = \int_F g\,d\mu, \qquad F \in \mathscr{F}. \]

Proof Uniqueness being clear, the proof is devoted to proving existence. By considering real and imaginary parts separately it suffices to consider the case of real scalars. Then, decomposing $\nu$ into positive and negative parts via the Jordan decomposition, it suffices to consider the case where $\nu$ is a finite nonnegative measure.

Consider the set

\[ S := \Bigl\{ f \in L^1(\Omega, \mu) : f \ge 0,\ \int_F f\,d\mu \le \nu(F) \text{ for all } F \in \mathscr{F} \Bigr\}. \]

Then $0 \in S$, so $S$ is nonempty. Let

\[ M := \sup_{f \in S} \int_\Omega f\,d\mu. \]

For all $f \in S$ we have $\int_\Omega f\,d\mu \le \nu(\Omega)$ and therefore $M \le \nu(\Omega) < \infty$.

Step 1 – In this step we prove that there exists a function $g \in S$ for which the supremum in the definition of $M$ is attained. Let $(f_n)_{n \ge 1}$ be a sequence in $S$ with the property that $\lim_{n \to \infty} \int_\Omega f_n\,d\mu = M$. Set $g_n := f_1 \vee \dots \vee f_n$. Any set $F \in \mathscr{F}$ can be written as a disjoint union of sets $F^{(1)}, \dots, F^{(n)} \in \mathscr{F}$ such that $g_n = f_j$ on $F^{(j)}$ and therefore

\[ \int_F g_n\,d\mu = \sum_{j=1}^n \int_{F^{(j)}} f_j\,d\mu \le \sum_{j=1}^n \nu(F^{(j)}) = \nu(F). \]

It follows that $g_n \in S$. The sequence $(g_n)_{n \ge 1}$ is nondecreasing and therefore its pointwise limit $g := \lim_{n \to \infty} g_n$ is well defined as a $[0,\infty]$-valued function. By the monotone convergence theorem, for all $F \in \mathscr{F}$ we have

\[ \int_F g\,d\mu = \lim_{n \to \infty} \int_F g_n\,d\mu \le \nu(F) \tag{2.17} \]

and therefore $g$ takes finite values $\mu$-almost everywhere and belongs to $S$. Moreover,

\[ M = \lim_{n \to \infty} \int_\Omega f_n\,d\mu \le \lim_{n \to \infty} \int_\Omega g_n\,d\mu = \int_\Omega g\,d\mu \le M \]

and therefore equality holds at all places. This proves that the supremum in the definition of $M$ is attained by the function $g \in S$.


Step 2 – Under the additional assumption that $\mu$ is a finite measure, we show next that $g$ has the required properties. To this end we must show that $\eta = 0$, where the finite measure $\eta$ is defined by

\[ \eta(F) := \nu(F) - \int_F g\,d\mu, \qquad F \in \mathscr{F}. \]

Assume, for a contradiction, that $\eta(\Omega) > 0$. Consider the real-valued measures

\[ \eta_n := \eta - \frac{1}{n}\mu, \qquad n \ge 1. \]

(It is here that we use the assumption that $\mu$ is finite; without this assumption $\eta_n$ would not be a real-valued measure.) For each $n \ge 1$ we decompose $\Omega = \Omega_n^+ \cup \Omega_n^-$ with respect to $\eta_n$ as in Theorem 2.45, and set $\Omega^- := \bigcap_{n \ge 1} \Omega_n^-$. If we had $\eta(\Omega^-) > 0$, then for large enough $n \ge 1$ we would have

\[ 0 < \eta(\Omega^-) - \frac{1}{n}\mu(\Omega^-) = \eta_n(\Omega^-) \le 0, \]

which is absurd. This contradiction proves that $\eta(\Omega^-) = 0$. If we had $\eta(\Omega_n^+) = 0$ for all $n \ge 1$ it would follow that $\eta(\Omega) = \eta(\Omega_n^-)$ for all $n \ge 1$, and therefore $\eta(\Omega) = \eta(\Omega^-) = 0$. Having assumed that $\eta(\Omega) > 0$, we conclude that $\eta(\Omega_n^+) > 0$ for some $n \ge 1$. If $F \in \mathscr{F}$ is a subset of $\Omega_n^+$, then

\[ \eta(F) - \frac{1}{n}\mu(F) = \eta_n(F) = \eta_n(F \cap \Omega_n^+) = \eta_n^+(F) \ge 0 \]

and therefore $\eta(F) \ge \frac{1}{n}\mu(F)$. Letting $h := g + \frac{1}{n}\mathbf{1}_{\Omega_n^+}$ we obtain, for arbitrary $F \in \mathscr{F}$,

\[
\begin{aligned}
\int_F h\,d\mu &= \int_F g\,d\mu + \frac{1}{n}\mu(F \cap \Omega_n^+) \le \int_F g\,d\mu + \eta(F \cap \Omega_n^+) \\
&= \int_{F \cap \Omega_n^-} g\,d\mu + \nu(F \cap \Omega_n^+) \\
&\le \nu(F \cap \Omega_n^-) + \nu(F \cap \Omega_n^+) = \nu(F),
\end{aligned}
\]

using (2.17) in the last inequality. This proves that $h \in S$. Then, by the definition of $M$,

\[ M \ge \int_\Omega h\,d\mu = \int_\Omega g\,d\mu + \frac{1}{n}\mu(\Omega \cap \Omega_n^+) = M + \frac{1}{n}\mu(\Omega_n^+). \]

Since $M < \infty$, this is only possible if $\mu(\Omega_n^+) = 0$. By (2.17) and absolute continuity this would imply $\int_{\Omega_n^+} g\,d\mu \le \nu(\Omega_n^+) = 0$ and therefore, by the definition of $\eta$ and the nonnegativity of $g$,

\[ 0 < \eta(\Omega_n^+) = -\int_{\Omega_n^+} g\,d\mu \le 0. \]

This contradiction concludes the proof that $\eta(\Omega) = 0$.


Step 3 – For finite measures $\mu$ the theorem has now been proved. It remains to extend the result to the $\sigma$-finite case. Again it suffices to consider the case where $\nu$ is a finite nonnegative measure.

Write $\Omega = \bigcup_{n \ge 1} \Omega_n$, where the sets $\Omega_n \in \mathscr{F}$ are disjoint and satisfy $\mu(\Omega_n) < \infty$. Define the nonnegative function $g$ on $\Omega$ by

\[ g(\omega) := g_n(\omega) \quad \text{for } \omega \in \Omega_n,\ n \ge 1, \]

where the nonnegative functions $g_n \in L^1(\Omega_n, \mu|_{\Omega_n})$ are given by Step 2 applied to $\Omega_n$, that is,

\[ \nu(F \cap \Omega_n) = \int_{F \cap \Omega_n} g_n\,d\mu|_{\Omega_n} = \int_{F \cap \Omega_n} g\,d\mu, \qquad F \in \mathscr{F}. \]

By additivity, this implies

\[ \nu\Bigl(F \cap \bigcup_{n=1}^N \Omega_n\Bigr) = \int_{F \cap \bigcup_{n=1}^N \Omega_n} g\,d\mu, \qquad F \in \mathscr{F}. \]

Letting $N \to \infty$, by monotone convergence we obtain

\[ \nu(F) = \int_F g\,d\mu, \qquad F \in \mathscr{F}. \]

Taking $F = \Omega$ and using that $\nu$ is finite, we see that $g$ is integrable with respect to $\mu$, that is, we have $g \in L^1(\Omega, \mu)$. The function $g$ has the required properties.

An alternative proof of the Radon–Nikodym theorem, based on Hilbert space methods, is outlined in Problem 3.22.
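On a finite set the Radon–Nikodym density can be written down directly, which may help to see what the theorem asserts. The following sketch assumes discrete measures given by point weights (dictionaries `mu` and `nu`, an encoding chosen for the illustration) and uses exact rational arithmetic; the function names `density`, `integrate` and `subsets` are hypothetical, and the construction is not the one used in the proof above.

```python
from fractions import Fraction
from itertools import chain, combinations

def subsets(points):
    points = list(points)
    return chain.from_iterable(combinations(points, k) for k in range(len(points) + 1))

def density(nu, mu):
    """Radon-Nikodym density g = d(nu)/d(mu) for discrete measures.

    mu: nonnegative point weights; nu: real point weights.
    On mu-null points the value of g is irrelevant; we set it to 0.
    """
    g = {}
    for w in mu:
        if mu[w] == 0:
            if nu[w] != 0:
                raise ValueError("nu is not absolutely continuous with respect to mu")
            g[w] = Fraction(0)
        else:
            g[w] = Fraction(nu[w]) / Fraction(mu[w])
    return g

def integrate(g, mu, F):
    """Integral of g over F with respect to the discrete measure mu."""
    return sum(g[w] * mu[w] for w in F)

mu = {"a": 2, "b": 0, "c": 4}
nu = {"a": 1, "b": 0, "c": -6}
g = density(nu, mu)
identity_holds = all(integrate(g, mu, F) == sum(nu[w] for w in F) for F in subsets(mu))
```

The freedom in choosing $g$ on the $\mu$-null point `"b"` reflects the fact that uniqueness in the theorem is uniqueness as an element of $L^1(\Omega, \mu)$.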


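Example 2.47. If $f : \Omega \to \mathbb{K}$ is integrable with respect to $\mu$, then

\[ \nu(F) := \int_F f\,d\mu, \qquad F \in \mathscr{F}, \]

defines a $\mathbb{K}$-valued measure and for all $F \in \mathscr{F}$ we have

\[ |\nu|(F) = \int_F |f|\,d\mu. \]

If $f$ is real-valued, then $\nu$ is a real measure and

\[ \nu^\pm(F) = \int_F f^\pm\,d\mu. \]

To prove the first assertion, let $A_1, \dots, A_k \in \mathscr{F}$ be disjoint subsets of $F$. Then

\[ \sum_{j=1}^k |\nu(A_j)| = \sum_{j=1}^k \Bigl| \int_{A_j} f\,d\mu \Bigr| \le \sum_{j=1}^k \int_{A_j} |f|\,d\mu \le \int_F |f|\,d\mu, \]

which gives the upper bound '$\le$'. To prove the lower bound '$\ge$', we use Proposition F.1 to choose simple functions $g_n : \Omega \to \mathbb{K}$ such that $g_n \to \mathbf{1}_F f$ and $0 \le |g_n| \le \mathbf{1}_F |f|$, say $g_n = \sum_{j=1}^{k_n} c_j^{(n)} \mathbf{1}_{A_j^{(n)}}$ with $A_1^{(n)}, \dots, A_{k_n}^{(n)} \in \mathscr{F}$ disjoint subsets of $F$. Then

\[ \int_F |f|\,d\mu = \lim_{n \to \infty} \sum_{j=1}^{k_n} |c_j^{(n)}|\,\mu(A_j^{(n)}) \le \limsup_{n \to \infty} \sum_{j=1}^{k_n} |\nu(A_j^{(n)})| \le |\nu|(F). \]

Here the middle inequality holds because $\sum_{j=1}^{k_n} \bigl| c_j^{(n)} \mu(A_j^{(n)}) - \nu(A_j^{(n)}) \bigr| \le \|g_n - \mathbf{1}_F f\|_1 \to 0$ by dominated convergence.

To prove the second assertion we note that the sets $\Omega^+ = \{f \ge 0\}$ and $\Omega^- = \{f < 0\}$ satisfy the requirements of the second part of Theorem 2.45, and the second part of the proof of the theorem shows that the decomposition $\mu = \mu^+ - \mu^-$ is obtained from any such decomposition of $\Omega$. For real scalars this also gives a second proof of the first assertion:

\[ |\nu|(F) = \nu^+(F) + \nu^-(F) = \int_F f^+ + f^-\,d\mu = \int_F |f|\,d\mu. \]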


Example 2.48. If $\mu$ is a $\mathbb{K}$-valued measure, then $\mu$ is absolutely continuous with respect to its variation $|\mu|$. By the Radon–Nikodym theorem, there exists $h \in L^1(\Omega, |\mu|)$ such that

\[ \mu(F) = \int_F h\,d|\mu|, \qquad F \in \mathscr{F}. \]

By the result of Example 2.47,

\[ |\mu|(F) = \int_F |h|\,d|\mu|, \qquad F \in \mathscr{F}, \]

so $|h| = 1$ $|\mu|$-almost everywhere.

2.4.c Integration with Respect to K-Valued Measures


A measurable function $f$ is said to be integrable with respect to a $\mathbb{K}$-valued measure $\mu$ if it is integrable with respect to $|\mu|$. The function $f$ is integrable with respect to a real measure $\mu$ if and only if it is integrable with respect to the measures $\mu^+$ and $\mu^-$, where $\mu = \mu^+ - \mu^-$ is the Jordan decomposition, and $f$ is integrable with respect to a complex measure $\mu$ if and only if $f$ is integrable with respect to the real and imaginary parts of $\mu$.

The integral of an integrable function $f$ with respect to a real measure $\mu$ is defined by

\[ \int_\Omega f\,d\mu := \int_\Omega f\,d\mu^+ - \int_\Omega f\,d\mu^-, \]

and the integral of an integrable function $f$ with respect to a complex measure $\mu$ by

\[ \int_\Omega f\,d\mu := \int_\Omega f\,d\operatorname{Re}\mu + i \int_\Omega f\,d\operatorname{Im}\mu. \]

Proposition 2.49. If $f$ is integrable with respect to a $\mathbb{K}$-valued measure $\mu$, then

\[ \Bigl| \int_\Omega f\,d\mu \Bigr| \le \int_\Omega |f|\,d|\mu|. \]

Proof First let $f = \sum_{n=1}^N c_n \mathbf{1}_{F_n}$ be a simple function, with the sets $F_n \in \mathscr{F}$ disjoint. Then

\[ \Bigl| \int_\Omega f\,d\mu \Bigr| = \Bigl| \sum_{n=1}^N c_n \mu(F_n) \Bigr| \le \sum_{n=1}^N |c_n|\,|\mu(F_n)| \le \sum_{n=1}^N |c_n|\,|\mu|(F_n) = \int_\Omega |f|\,d|\mu|. \]

The general case follows from this by observing that the simple functions are dense in $L^1(\Omega, |\mu|)$ and that $f_n \to f$ in $L^1(\Omega, |\mu|)$ implies $\int_\Omega |f_n - f|\,d\nu \to 0$ for each of the measures $\nu \in \{\operatorname{Re}\mu, \operatorname{Im}\mu, \mu^+, \mu^-\}$.

A more elegant, but less elementary, alternative definition of the integral $\int_\Omega f\,d\mu$ can be given with the help of the Radon–Nikodym theorem. Indeed, defining $\int_\Omega f\,d\mu$ as above, by the result of Example 2.48 for functions $f \in L^1(\Omega, |\mu|)$ we have the identity

\[ \int_\Omega f\,d\mu = \int_\Omega f h\,d|\mu|, \]

where $d\mu = h\,d|\mu|$ as in the example (note that $fh \in L^1(\Omega, |\mu|)$ since $|h| = 1$ $|\mu|$-almost everywhere). This identity could be taken as an alternative definition for the integral $\int_\Omega f\,d\mu$.
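For a complex measure on a finite set both definitions of the integral can be compared numerically. The sketch below is only an illustration: the measure is encoded by complex point weights and $f$ by a dictionary of values (both our own conventions), `integral` follows the definition via real and imaginary parts and their Jordan decompositions, and `integral_polar` uses the density $h = \mu/|\mu|$ on points of nonzero mass; all function names are hypothetical.

```python
def integral_real(f, weights):
    """Integral of f against a real discrete measure via mu = mu^+ - mu^-."""
    plus = sum(f[w] * max(m, 0.0) for w, m in weights.items())
    minus = sum(f[w] * max(-m, 0.0) for w, m in weights.items())
    return plus - minus

def integral(f, weights):
    """Integral against a complex discrete measure via Re(mu) and Im(mu)."""
    real_part = {w: complex(m).real for w, m in weights.items()}
    imag_part = {w: complex(m).imag for w, m in weights.items()}
    return integral_real(f, real_part) + 1j * integral_real(f, imag_part)

def integral_polar(f, weights):
    """Integral of f*h against |mu|, where h = mu/|mu| on points of nonzero mass."""
    total = 0
    for w, m in weights.items():
        mass = abs(m)
        if mass > 0:
            h = m / mass          # |h| = 1
            total += f[w] * h * mass
    return total

weights = {"a": 3 + 4j, "b": -2 + 0j, "c": -1j}
f = {"a": 1, "b": 2j, "c": -1}

value = integral(f, weights)
bound = sum(abs(f[w]) * abs(m) for w, m in weights.items())  # integral of |f| d|mu|
```

In this example the integral equals $3 + i$, while $\int_\Omega |f|\,d|\mu| = 10$, in accordance with Proposition 2.49.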


2.5 Banach Lattices 


Over the real scalar field, all Banach spaces discussed in this chapter are examples of Banach lattices, a class of Banach spaces that will be briefly discussed in this section. The main result, Theorem 2.57, shows that any complete norm on a Banach lattice $X$ which is monotone with respect to the partial order of $X$ is equivalent to the given norm of $X$.

Let $(S, \le)$ be a partially ordered set and let $S'$ be a subset of $S$. An element $x \in S$ is said to be a lower bound for $S'$ if we have $x \le x'$ for all $x' \in S'$. Such an element is called a greatest lower bound for $S'$ if $y \le x$ holds for every lower bound $y$ for $S'$. Similarly an element $x \in S$ is said to be an upper bound for $S'$ if we have $x' \le x$ for all $x' \in S'$, and such an element is called a least upper bound for $S'$ if $x \le y$ holds for every upper bound $y$ for $S'$. Greatest lower bounds and least upper bounds, if they exist, are unique.

Definition 2.50 (Lattices). A partially ordered set $(S, \le)$ is called a lattice if every pair of elements has a greatest lower bound and a least upper bound.

The greatest lower bound and the least upper bound of the pair $\{x, y\} \subseteq S$ in a partially ordered set $S$ will be denoted by $x \wedge y$ and $x \vee y$, respectively.


Definition 2.51 (Vector lattices). A vector lattice is a partially ordered real vector space $(V, \le)$ with the following properties:

(i) $(V, \le)$ is a lattice;
(ii) for all $0 \le c \in \mathbb{R}$ and $u, v \in V$ we have $u \le v \Rightarrow cu \le cv$;
(iii) for all $u, v, w \in V$ we have $u \le v \Rightarrow u + w \le v + w$.

Let $(V, \le)$ be a vector lattice. If $u, u', v, v' \in V$ satisfy $u \le v$ and $u' \le v'$, then $u + u' \le v + u'$ and $v + u' \le v + v'$. Thus,

\[ [\,u \le v \text{ and } u' \le v'\,] \ \Rightarrow\ u + u' \le v + v' \]

by transitivity. Also, if $u \le v$, then $-v = u + (-u - v) \le v + (-u - v) = -u$, and the converse inequality is obtained similarly. Thus,

\[ u \le v \iff (-v) \le (-u). \tag{2.18} \]

For $v \in V$ we define

\[ v^+ := v \vee 0, \qquad v^- := (-v) \vee 0, \qquad |v| := v \vee (-v). \]

If $0 \le c \in \mathbb{R}$, then $(cv)^\pm = c v^\pm$, and if $c \in \mathbb{R}$, then $|cv| = |c|\,|v|$; the easy proofs are left to the reader. Furthermore, from $\pm(u + v) \le |u| + |v|$ it follows that

\[ |u + v| \le |u| + |v|. \tag{2.19} \]

The next proposition lists some slightly less trivial identities.


Proposition 2.52. Let $(V, \le)$ be a vector lattice. Then for all $u, v, w \in V$ we have:

(1) $(-u) \wedge (-v) = -(u \vee v)$;
(2) $u + (v \vee w) = (u + v) \vee (u + w)$;
(3) $u + v = u \wedge v + u \vee v$;
(4) $v = v^+ - v^-$;
(5) $|v| = v^+ + v^-$.

The representation of $v$ as the difference of two positive elements in (4) is unique; see Problem 2.42.


Proof Let $u, v, w \in V$.

(1): We have $u \le u \vee v$, so $-(u \vee v) \le -u$ by (2.18). In the same way we obtain $-(u \vee v) \le -v$. It follows that $-(u \vee v)$ is a lower bound for $\{-u, -v\}$. To prove that it is the greatest lower bound, we must show that if $w \le -u$ and $w \le -v$, then $w \le -(u \vee v)$. This follows by noting that $u \le -w$ and $v \le -w$, so $-w$ is an upper bound for $\{u, v\}$ and therefore $u \vee v \le -w$. By (2.18) this implies $w \le -(u \vee v)$ as required.

(2): We have $u + v \le u + (v \vee w)$ and $u + w \le u + (v \vee w)$, so $u + (v \vee w)$ is an upper bound for $\{u + v, u + w\}$. To prove that it is the least upper bound we must show that if $u + v \le x$ and $u + w \le x$, then $u + (v \vee w) \le x$. But $v \le x - u$ and $w \le x - u$ imply $v \vee w \le x - u$ and therefore $u + (v \vee w) \le x$ as desired.

(3): In view of (1) we must show that $u + v + (-u) \wedge (-v) = u \wedge v$.

We have $u + v + (-u) \wedge (-v) \le u + v + (-v) = u$ and similarly $u + v + (-u) \wedge (-v) \le v$. It follows that $u + v + (-u) \wedge (-v)$ is a lower bound for $\{u, v\}$. To prove that it is the greatest lower bound we must show that if $w \le u$ and $w \le v$, then $w \le u + v + (-u) \wedge (-v)$, or equivalently $w - u - v \le (-u) \wedge (-v)$. By (2.18) and (1), this in turn is equivalent to $u \vee v \le u + v - w$. To prove this inequality we note that $w \le u$ implies $0 \le u - w$ and hence $v \le u + v - w$. In the same way we obtain $u \le u + v - w$, and together these inequalities imply $u \vee v \le u + v - w$ as desired.

(4): Taking $u = 0$ in (3) and using (1) we obtain $v = 0 \wedge v + 0 \vee v = v^+ - (0 \vee (-v)) = v^+ - v^-$.

(5): By (2), $|v| = v \vee (-v)$ implies $|v| - v = 0 \vee (-2v) = (2v)^- = 2v^-$. It follows that $|v| = v + 2v^- = v^+ - v^- + 2v^- = v^+ + v^-$.
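The identities of Proposition 2.52 are easy to test in the simplest vector lattice, $\mathbb{R}^n$ with the coordinatewise order, where $\vee$ and $\wedge$ are the coordinatewise maximum and minimum. The Python sketch below checks (1) to (5) for a few integer vectors; the encoding of vectors as tuples and the helper names are assumptions of the illustration.

```python
def join(u, v):
    """Least upper bound in R^n with the coordinatewise order."""
    return tuple(max(a, b) for a, b in zip(u, v))

def meet(u, v):
    """Greatest lower bound in R^n with the coordinatewise order."""
    return tuple(min(a, b) for a, b in zip(u, v))

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def neg(u):
    return tuple(-a for a in u)

def pos(v):
    """Positive part v^+ = v v 0."""
    return join(v, (0,) * len(v))

def negpart(v):
    """Negative part v^- = (-v) v 0."""
    return join(neg(v), (0,) * len(v))

def modulus(v):
    """|v| = v v (-v)."""
    return join(v, neg(v))

u, v, w = (1, -2, 3), (-4, 5, 0), (2, 2, -7)

identities = [
    meet(neg(u), neg(v)) == neg(join(u, v)),               # (1)
    add(u, join(v, w)) == join(add(u, v), add(u, w)),      # (2)
    add(u, v) == add(meet(u, v), join(u, v)),              # (3)
    v == add(pos(v), neg(negpart(v))),                     # (4)
    modulus(v) == add(pos(v), negpart(v)),                 # (5)
]
```

Such a check proves nothing about general vector lattices, but it is a quick way to catch sign errors when working with these identities.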


Definition 2.53 (Normed vector lattices). A normed vector lattice is a triple $(X, \|\cdot\|, \le)$ with the following properties:

(i) the pair $(X, \|\cdot\|)$ is a real normed space;
(ii) the pair $(X, \le)$ is a vector lattice;
(iii) for all $x, y \in X$ we have $|x| \le |y| \Rightarrow \|x\| \le \|y\|$.

In any normed vector lattice we have

\[ \|\,|x|\,\| = \|x\|. \tag{2.20} \]

Moreover, the lattice operations $x \mapsto x^+$, $x \mapsto x^-$, $x \mapsto |x|$ are continuous. Indeed, by (2.19) we have $\bigl|\,|x| - |y|\,\bigr| \le |x - y|$ and therefore, by (2.20),

\[ \|\,|x| - |y|\,\| \le \|\,|x - y|\,\| = \|x - y\|. \]

This gives the continuity of $x \mapsto |x|$. The other two assertions follow from this by noting that $x^+ = \frac{1}{2}(x + |x|)$ and $x^- = x^+ - x$. As a consequence we note that the positive cone

\[ X^+ := \{x \in X : x \ge 0\} \]

is closed.

Definition 2.54 (Banach lattices). A Banach lattice is a complete normed vector lattice.


The spaces $c_0$, $\ell^p$, $C(K)$, and $L^p(\Omega)$ with $1 \le p \le \infty$, are Banach lattices with respect to their natural pointwise ordering, and $M(\Omega)$ is a Banach lattice with respect to the ordering given by declaring $\mu \le \nu$ if and only if the measure $\nu - \mu$ is nonnegative; the greatest lower bound and least upper bound of $\mu$ and $\nu$ are given by

\[
\begin{aligned}
\mu \wedge \nu &= \mu - (\mu - \nu)^+ = \nu - (\nu - \mu)^+, \\
\mu \vee \nu &= \mu + (\nu - \mu)^+ = \nu + (\mu - \nu)^+,
\end{aligned}
\]

respectively (cf. Problem 2.41); here, $(\nu - \mu)^+$ is defined as in Theorem 2.45. Alternatively one may use the analogues for $M(\Omega)$ of the formulas appearing in Theorem 4.5. The Jordan decomposition of a real measure now becomes a special case of Proposition 2.52(4).


Definition 2.55. Let $V$ and $W$ be vector lattices. A linear operator $T : V \to W$ is said to be positivity preserving if $v \ge 0$ implies $Tv \ge 0$.

If $T : V \to W$ is positivity preserving, then

\[ |Tv| \le T|v|, \qquad v \in V. \tag{2.21} \]

Indeed, from $v \le |v|$ we have $Tv \le T|v|$ and from $0 \le |v|$ we have $0 = T0 \le T|v|$. Combining these inequalities gives $(Tv)^+ \le T|v|$. In the same way we see that $(Tv)^- \le T|v|$, and the claim follows.
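In finite dimensions, a matrix with nonnegative entries is a positivity preserving operator between the vector lattices $\mathbb{R}^n$ and $\mathbb{R}^m$ with their coordinatewise orders, and (2.21) becomes a coordinatewise inequality. The sketch below checks it for one example; the matrix `T`, the vector `v` and the helper names are choices made for the illustration.

```python
def apply(T, v):
    """Matrix-vector product, with T given as a list of rows."""
    return tuple(sum(t * x for t, x in zip(row, v)) for row in T)

def modulus(v):
    """|v| in R^n with the coordinatewise order."""
    return tuple(abs(x) for x in v)

def leq(u, v):
    """Coordinatewise order on R^n."""
    return all(a <= b for a, b in zip(u, v))

T = [[1, 2, 0],
     [0, 3, 1]]          # nonnegative entries: positivity preserving
v = (1, -2, 4)

lhs = modulus(apply(T, v))   # |Tv|
rhs = apply(T, modulus(v))   # T|v|
```

Here $|Tv| = (3, 2)$ and $T|v| = (5, 10)$; a matrix with a negative entry will in general violate the inequality.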


Theorem 2.56. Let $X$ and $Y$ be Banach lattices. Every positivity preserving linear operator $T : X \to Y$ is bounded.

Proof Reasoning by contradiction, suppose that $T$ is not bounded. Then for all $n \ge 1$ there is a norm one vector $x_n \in X$ such that $\|Tx_n\| \ge n^3$. By (2.20) and (2.21), upon replacing $x_n$ by $|x_n|$ we may assume that $x_n \ge 0$ for all $n \ge 1$. In view of $\sum_{n \ge 1} \|x_n\|/n^2 < \infty$ the sum $\sum_{n \ge 1} x_n/n^2$ converges in $X$. For all $1 \le n \le N$ we have $x_n/n^2 \le \sum_{m=1}^N x_m/m^2$, and the closedness of the positive cone $X^+$ implies that for all $n \ge 1$ we have $x_n/n^2 \le \sum_{m \ge 1} x_m/m^2$. Hence, for all $n \ge 1$,

\[ n \le \frac{1}{n^2} \|Tx_n\| \le \Bigl\| T \sum_{m \ge 1} \frac{x_m}{m^2} \Bigr\|. \]

This contradiction completes the proof.


This theorem has an interesting consequence: 


Theorem 2.57. Any two norms which turn a vector lattice into a Banach lattice are 
equivalent. 


Proof Suppose the vector lattice $(X, \le)$ is a Banach lattice with respect to the norms $\|\cdot\|$ and $\|\cdot\|'$. Then by Theorem 2.56 the identity mapping from $(X, \|\cdot\|)$ to $(X, \|\cdot\|')$ and its inverse are bounded.


Problems 
2.1 Let $1 \le p \le q \le \infty$. Show that if $a \in \ell^p$, then $a \in \ell^r$ for all $p \le r \le q$ and

\[ \|a\|_r \le \|a\|_p, \qquad \lim_{r \to q} \|a\|_r = \|a\|_q. \]

Hint: First show that it suffices to consider sequences $a = (a_n)_{n \ge 1}$ satisfying $|a_n| \le 1$ for all $n \ge 1$.

2.2 Show that $c_0$ and $\ell^p$ with $1 \le p < \infty$ are separable, but $\ell^\infty$ is not.

Hint: $\ell^\infty$ contains an uncountable family $(a^{(i)})_{i \in I}$ such that $\|a^{(i)} - a^{(i')}\| = 1$ for all $i \ne i'$. Prove that a normed space $X$ containing such a family is nonseparable.

2.3 Prove the completeness assertions at the end of Section 2.2.a.

2.4 Let $K$ be a compact metric space. Our aim is to prove that if $(f_n)_{n \ge 1}$ is a sequence in $C(K)$ satisfying $f_1(x) \ge f_2(x) \ge \dots \ge 0$ and $\lim_{n \to \infty} f_n(x) = 0$ for all $x \in K$, then $\lim_{n \to \infty} f_n = 0$ uniformly on $K$. This result, known as Dini's theorem, provides one of the rare instances where pointwise convergence implies uniform convergence.

(a) Reasoning by contradiction, show that if the result is false, then there exists an $\varepsilon > 0$, a sequence $(x_n)_{n \ge 1}$ in $K$, and an $x \in K$, such that $\lim_{n \to \infty} x_n = x$ and $f_n(x_n) \ge \varepsilon$ for all $n \ge 1$.

(b) Using that also $f_n(x) \downarrow 0$ as $n \to \infty$ and $f_n(x_n) \le f_m(x_n)$ when $n \ge m$, show that this leads to a contradiction.

2.5 Find a sequence $(f_n)_{n \ge 1}$ in $C_{\mathrm{b}}(0,1)$ such that $0 \le f_{n+1}(x) \le f_n(x) \le 1$ for all $x \in (0,1)$ and $n \ge 1$ and $f_n(x) \downarrow 0$ for all $x \in (0,1)$, but $\|f_n\|_\infty \not\to 0$. Compare this with Dini's theorem (Problem 2.4).

2.6 Let $K$ be a compact topological space and $X$ be a Banach space. Prove that the space $C(K;X)$ of all continuous functions $f : K \to X$ is a Banach space with respect to the supremum norm $\|f\|_\infty := \sup_{x \in K} \|f(x)\|$.

2.7 Let $D$ be a bounded open subset of $\mathbb{R}^d$. By $C^k(\overline{D})$ we denote the space of functions $f : D \to \mathbb{K}$ that are $k$ times continuously differentiable on $D$ and all of whose partial derivatives $\partial^\alpha f$ extend continuously to $\overline{D}$ for all multi-indices $\alpha = (\alpha_1, \dots, \alpha_d) \in \mathbb{N}^d$ satisfying $|\alpha| := \alpha_1 + \dots + \alpha_d \le k$. Here, $\partial^\alpha := \partial_1^{\alpha_1} \circ \dots \circ \partial_d^{\alpha_d}$, where $\partial_j$ is the partial derivative in the $j$th direction. Prove that $C^k(\overline{D})$ is a Banach space with respect to the norm

\[ \|f\|_{C^k(\overline{D})} := \max_{|\alpha| \le k} \|\partial^\alpha f\|_\infty. \]


2.8 Consider the vector space $P[0,1]$ of all polynomials on $[0,1]$.
(a) Show that 


| 


ba 
: nang [Fo ¥ co 
=0 


k: 


N 
+ > Vent” 


n=0 


j2 


) 


(3) 
I: > \. Cop 02! | 
k=1 = 


defines a norm on $P[0,1]$. Here, $\lfloor y \rfloor$ is the greatest integer $n \le y$ and $\lceil y \rceil$ is the least integer $n \ge y$.

(b) Show that the functions $1, t^2, t^4, \dots$ span a subspace of $P[0,1]$ which is dense with respect to the supremum norm.

(c) Conclude that two different norms on a normed space may agree on a subspace which is dense with respect to one of these norms.


2.9 We prove the existence of smooth functions with various properties. 


(a) Show that the function $f : \mathbb{R} \to [0,\infty)$ defined by


Kaye an x>0, 


$0$, $x \le 0$,
belongs to $C^\infty(\mathbb{R})$.

(b) Show that there exists a function $f \in C_{\mathrm{c}}^\infty(0,1)$ such that $f \ge 0$ pointwise and $\int_0^1 f(x)\,dx = 1$.

(c) Show that if $D \subseteq \mathbb{R}^d$ is open and nonempty, there exists a function $f \in C_{\mathrm{c}}^\infty(D)$ such that $f \ge 0$ pointwise and $\int_D f(x)\,dx = 1$.

(d) Show that if $f \in C_{\mathrm{c}}^\infty(\mathbb{R}^d)$ and $g : \mathbb{R}^d \to \mathbb{K}$ is continuous, then the convolution $f * g$ is smooth and

\[ \partial^\alpha (f * g) = (\partial^\alpha f) * g \]

for every multi-index $\alpha \in \mathbb{N}^d$; notation is as in Problem 2.7.

(e) Show that if $K \subseteq D \subseteq \mathbb{R}^d$ where $K$ is nonempty and compact and $D$ is open, then there exists a function $f \in C_{\mathrm{c}}^\infty(D)$ such that $0 \le f \le 1$ pointwise and $f \equiv 1$ on $K$.

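Hint: Let $\delta := d(K, \complement D)$ and put

\[ K' := \{x \in D : d(x,K) \le \tfrac{1}{3}\delta\}, \qquad D' := \{x \in D : d(x,K) < \tfrac{2}{3}\delta\}. \]

Apply part (c) to select a nonnegative function $f \in C_{\mathrm{c}}^\infty(B(0; \tfrac{1}{3}\delta))$ satisfying $\int_{B(0; \frac{1}{3}\delta)} f(x)\,dx = 1$ and apply part (d) to the function

\[ g(x) := \frac{d(x, \complement D')}{d(x, \complement D') + d(x, K')}, \qquad x \in \mathbb{R}^d. \]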


2.10 Let

\[ F := \{f \in C[0,1] : 0 \le f \le 1,\ f(0) = 0,\ f(1) = 1\} \]

and consider the mapping $\Phi : C[0,1] \to C[0,1]$ defined by

\[ (\Phi(f))(t) := t f(t), \qquad t \in [0,1]. \]

(a) Show that $F$ is bounded, convex, and closed in $C[0,1]$.
(b) Show that $\Phi$ maps $F$ into $F$ and satisfies

\[ \|\Phi(f) - \Phi(g)\|_\infty < \|f - g\|_\infty \]

for all $f, g \in F$, $f \ne g$.
(c) Show that $\Phi$ has no fixed point in $F$.
(d) Compare this result with the Banach fixed point theorem.
2.11 Let $1 \le p \le \infty$.

(a) Show that for $1 \le p < \infty$ the space $L^p(0,1)$ is the completion of $C[0,1]$ with respect to the norm

\[ \|f\|_p = \Bigl( \int_0^1 |f(t)|^p\,dt \Bigr)^{1/p}. \]

(b) Show that $C[0,1]$ can be identified in a natural way with a proper closed subspace of $L^\infty(0,1)$.

2.12 Prove the assertion in Remark 2.27. Can the $\sigma$-finiteness condition be omitted?

2.13 Let $(\Omega, \mathscr{F}, \mu)$ be a measure space. Show that if $f_n \to f$ in $L^\infty(\Omega)$, then there is a $\mu$-null set $N$ such that $\lim_{n \to \infty} \sup_{\omega \in \complement N} |f_n(\omega) - f(\omega)| = 0$. Compare this with Corollary 2.21.

2.14 Show that passing to a subsequence is necessary for Corollary 2.21 to be true. 
Hint: Consider the unit circle T = {z € C: |z| = 1} with its Lebesgue measure. 
Let t, :=Y : 


im=1 jm and consider the indicator functions of the arcs I, := {erm She 
(th ry th+1 jt. 
2.15 Consider the set
\[ S := \{ f \in L^1(0,1) : \ f(t) > 0 \text{ for almost all } t \in (0,1) \}. \]
(a) Determine whether $S$ is an open subset of $L^1(0,1)$.

(b) Characterise the functions belonging to the closure $\overline{S}$.

Consider the set
\[ S' := \{ f \in L^1(0,1) : \ f(t) \ge 0 \text{ for almost all } t \in (0,1) \}. \]

(c) Determine whether $S'$ is a closed subset of $L^1(0,1)$.

(d) Characterise the functions belonging to the interior of $S'$.


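2.16 Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and let $\mathscr{G} \subseteq \mathscr{F}$ be a sub-$\sigma$-algebra. Show that
for all $1 \le p \le \infty$ the set $L^p(\Omega, \mathscr{G})$ consisting of all $f \in L^p(\Omega)$ that are equal
$\mu$-almost everywhere to a $\mathscr{G}$-measurable function is a closed subspace of $L^p(\Omega)$.

2.17 For $n \in \mathbb{N}$ and $j \in \{0, 1, \dots, 2^n - 1\}$ we consider the interval
$I_{j,n} := (\frac{j}{2^n}, \frac{j+1}{2^n}) \subseteq (0,1)$. Let $1 \le p < \infty$ and define the operators
$E_n : L^p(0,1) \to L^p(0,1)$, $n \ge 1$, by

\[ E_n f := \sum_{j=0}^{2^n - 1} \mathbf{1}_{I_{j,n}} \cdot 2^n \int_{I_{j,n}} f(t) \, dt, \quad f \in L^p(0,1). \]

(a) Show that each $E_n$ is bounded on $L^p(0,1)$ with norm $\|E_n\| = 1$.
(b) Show that for all $f \in L^p(0,1)$ we have $\lim_{n\to\infty} E_n f = f$ in $L^p(0,1)$.

Hint: Consider what happens for functions in the linear span of the set
$\{ \mathbf{1}_{I_{j,n}} : \ 0 \le j \le 2^n - 1, \ n \in \mathbb{N} \}$.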
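2.18 Using Young's inequality, show that if $f \in L^p(\mathbb{R}^d)$, $g \in L^q(\mathbb{R}^d)$, $h \in L^r(\mathbb{R}^d)$ with
$1 \le p, q, r \le \infty$ such that $\frac1p + \frac1q + \frac1r = 2$, then

\[ \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(x) g(x - y) h(y)| \, dx \, dy \le \|f\|_p \|g\|_q \|h\|_r. \]

2.19 Write out a proof of Corollary 2.30.

2.20 Let $(\Omega, \mathscr{F}, \mu)$ be a finite measure space and let $1 \le p \le \infty$. Prove that

\[ |||f||| := \Bigl| \int_\Omega f \, d\mu \Bigr| + \Bigl\| f - \Bigl( \int_\Omega f \, d\mu \Bigr) \mathbf{1} \Bigr\|_p \]

defines an equivalent norm on $L^p(\Omega)$. Here, $\| \cdot \|_p$ is the usual norm on $L^p(\Omega)$.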
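2.21 Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and fix $1 \le p, q \le \infty$.

(a) Prove that $L^p(\Omega) \cap L^q(\Omega)$ is a Banach space with respect to the norm

\[ \|f\|_{L^p(\Omega) \cap L^q(\Omega)} := \max\{ \|f\|_{L^p(\Omega)}, \|f\|_{L^q(\Omega)} \}. \]

(b) Prove that $L^p(\Omega) + L^q(\Omega) = \{ g + h : \ g \in L^p(\Omega), \ h \in L^q(\Omega) \}$ is a Banach
space with respect to the norm

\[ \|f\|_{L^p(\Omega) + L^q(\Omega)} := \inf\{ \|g\|_{L^p(\Omega)} + \|h\|_{L^q(\Omega)} : \ f = g + h, \ g \in L^p(\Omega), \ h \in L^q(\Omega) \}. \]

Hint: For the proof of completeness use Proposition 1.3.
(c) Prove that if $1 \le p \le r \le q \le \infty$, then

\[ L^p(\Omega) \cap L^q(\Omega) \subseteq L^r(\Omega) \subseteq L^p(\Omega) + L^q(\Omega) \]

and that the inclusion mappings are continuous.

Hint: Write $f = \mathbf{1}_{\{|f| > 1\}} f + \mathbf{1}_{\{|f| \le 1\}} f$.
(d) Prove that if $1 \le p \le r \le q \le \infty$ and $0 \le \theta \le 1$ is chosen such that
$\frac1r = \frac{1-\theta}{p} + \frac{\theta}{q}$, then for all $f \in L^p(\Omega) \cap L^q(\Omega)$ we have

\[ \|f\|_r \le \|f\|_p^{1-\theta} \|f\|_q^{\theta}. \]

Hint: Use Hölder's inequality with suitable exponents.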


2.22 Prove Lusin's theorem: If $D \subseteq \mathbb{R}^d$ is open and bounded, and $f : D \to \mathbb{K}$ is measurable,
then for every $\varepsilon > 0$ there exists a function $g \in C_c(D)$ such that

\[ |\{ x \in D : \ f(x) \ne g(x) \}| < \varepsilon. \]

Hint: Study the proof of Proposition 2.29.

2.23 Let $(\Omega, \mathscr{F}, \mu)$ be a finite measure space, let $1 \le p \le \infty$, and suppose that $(f_n)_{n \ge 1}$
is a bounded sequence in $L^p(\Omega)$ converging to a measurable function $f$ $\mu$-almost
everywhere.

(a) Using Fatou's lemma show that $f \in L^p(\Omega)$ and $\|f\|_p \le \liminf_{n\to\infty} \|f_n\|_p$.
Now assume that $1 < p \le \infty$.
(b) Show that $\lim_{n\to\infty} f_n = f$ in $L^1(\Omega)$.
Hint: First show that for every $\varepsilon > 0$ there exists an $r > 0$ such that
\[ \sup_{n \ge 1} \int_{\{|f_n| > r\}} |f_n| \, d\mu < \varepsilon. \]
(c) Do we have $\lim_{n\to\infty} f_n = f$ in $L^p(\Omega)$?

2.24 Let $(\Omega, \mathscr{F}, \mu)$ be a probability space and let $\phi : \mathbb{K} \to \mathbb{R}$ be a convex function.
Prove Jensen's inequality: If a function $f \in L^1(\Omega)$ has the property that $\phi \circ f$ is
integrable, then

\[ \phi\Bigl( \int_\Omega f \, d\mu \Bigr) \le \int_\Omega \phi \circ f \, d\mu. \]

Hint: A convex function $\phi$ is the pointwise supremum of all affine functions
$\psi(x) = ax + b$ satisfying $\psi \le \phi$ pointwise.
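2.25 On $\mathbb{R}_+ = (0,\infty)$ we consider the Borel measure $\mu$ given by
\[ \mu(B) := \int_B \frac{dt}{t}, \quad B \in \mathscr{B}(\mathbb{R}_+). \]
(a) Show that for all $B \in \mathscr{B}(\mathbb{R}_+)$ and $s \in \mathbb{R}_+$ we have

\[ \mu(B) = \mu(sB), \qquad \mu(B) = \mu(B^{-1}), \]

where $sB := \{ st : \ t \in B \}$ and $B^{-1} := \{ t^{-1} : \ t \in B \}$.
(b) For $h \in L^1(\mathbb{R}_+, \frac{dt}{t})$ and $s \in \mathbb{R}_+$ show that

\[ \int_0^\infty h(st) \, \frac{dt}{t} = \int_0^\infty h(t) \, \frac{dt}{t} = \int_0^\infty h(t^{-1}) \, \frac{dt}{t}. \]

Fix $1 \le p \le \infty$. For $f \in L^p(\mathbb{R}_+, \frac{dt}{t})$ and $g \in L^1(\mathbb{R}_+, \frac{dt}{t})$ we define the multiplicative
convolution

\[ f \circ g(t) := \int_0^\infty f(t/s) g(s) \, \frac{ds}{s}, \quad t \in \mathbb{R}_+. \]

(c) Show that the multiplicative convolution is well defined for almost all $t \in \mathbb{R}_+$
and that the following analogue of Young's inequality holds:

\[ \|f \circ g\|_{L^p(\mathbb{R}_+, \frac{dt}{t})} \le \|f\|_{L^p(\mathbb{R}_+, \frac{dt}{t})} \|g\|_{L^1(\mathbb{R}_+, \frac{dt}{t})}. \]
(d) Show that $f \circ g = g \circ f$.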
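2.26 Let $k : (0,1) \times (0,1) \to \mathbb{K}$ be measurable and suppose that

\[ A := \operatorname*{ess\,sup}_{y \in (0,1)} \int_0^1 |k(x,y)| \, dx < \infty, \qquad
   B := \operatorname*{ess\,sup}_{x \in (0,1)} \int_0^1 |k(x,y)| \, dy < \infty. \]

Let $1 \le p \le \infty$ and define, for $f \in L^p(0,1)$,

\[ T_k f(x) := \int_0^1 k(x,y) f(y) \, dy, \quad x \in (0,1). \]

Show that $T_k : L^p(0,1) \to L^p(0,1)$ is a well defined linear operator which satisfies
the so-called Schur estimate

\[ \|T_k f\|_p \le A^{1/p} B^{1 - 1/p} \|f\|_p, \quad f \in L^p(0,1). \]

Hint: Use Hölder's inequality.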

2.27 Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and let $X$ be a Banach space. For $1 \le p \le \infty$ we
denote by $L^p(\Omega; X)$ the space of all (equivalence classes of) strongly measurable
functions $f : \Omega \to X$ for which $\omega \mapsto \|f(\omega)\|$ belongs to $L^p(\Omega)$.
(a) Prove that $L^p(\Omega; X)$ is a Banach space with respect to the norm

\[ \|f\|_p := \bigl\| \omega \mapsto \|f(\omega)\| \bigr\|_{L^p(\Omega)}. \]

By $L^p(\Omega) \otimes X$ we denote the vector space obtained as the linear span in $L^p(\Omega; X)$
of the set of all functions of the form $f \otimes x$ (cf. (1.6)) with $f \in L^p(\Omega)$ and $x \in X$.
(b) Show that if $1 \le p < \infty$, then $L^p(\Omega) \otimes X$ is dense in $L^p(\Omega; X)$.

2.28 Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and $1 \le p < \infty$. Let $T$ be a bounded operator on
$L^p(\Omega)$ and let $X$ be a Banach space. Consider the linear mapping from $L^p(\Omega) \otimes X$
into itself defined by

\[ (T \otimes I)(f \otimes x) := (Tf) \otimes x. \]

(a) Show that the operator $T \otimes I$ is well defined.
(b) Prove that if $T$ is a positive operator, then $T \otimes I$ admits a unique extension to
a bounded operator on $L^p(\Omega; X)$, and that its norm equals the norm of $T$.
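2.29 Let $(\Omega, \mathscr{F}, \mu)$ and $(\Omega', \mathscr{F}', \mu')$ be measure spaces and let $1 \le p \le q \le \infty$.

(a) Show that the identity mapping on linear combinations of functions of the
form $(\mathbf{1}_A \otimes \mathbf{1}_B)(\omega, \omega') := \mathbf{1}_A(\omega) \mathbf{1}_B(\omega')$ extends uniquely to a contraction
operator from $L^p(\Omega; L^q(\Omega'))$ into $L^q(\Omega'; L^p(\Omega))$.

Hint: Use (1.7).

(b) Deduce that if $(\Omega, \mathscr{F}, \mu)$ and $(\Omega', \mathscr{F}', \mu')$ are $\sigma$-finite and $f : \Omega \times \Omega' \to \mathbb{K}$ is
a measurable function, then the continuous Minkowski inequality holds:

\[ \Bigl( \int_{\Omega'} \Bigl( \int_\Omega |f(\omega, \omega')|^p \, d\mu(\omega) \Bigr)^{q/p} d\mu'(\omega') \Bigr)^{1/q}
   \le \Bigl( \int_\Omega \Bigl( \int_{\Omega'} |f(\omega, \omega')|^q \, d\mu'(\omega') \Bigr)^{p/q} d\mu(\omega) \Bigr)^{1/p}. \]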


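2.30 Let $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ be given.

(a) Show that for all $c \in \mathbb{K}$ there exists a Borel null set $N_c \subseteq \mathbb{R}^d$ such that for all
$x \in \complement N_c$ we have

\[ \lim_{r \downarrow 0} \frac{1}{|B(x;r)|} \int_{B(x;r)} |f(y) - c| \, dy = |f(x) - c|. \tag{2.22} \]

(b) Show that there exists a Borel set $L \subseteq \mathbb{R}^d$ with $|\complement L| = 0$ such that (2.22) holds
for all $x \in L$ and $c \in \mathbb{K}$.
Hint: Consider $\bigcup_{n \ge 1} N_{c_n}$ with $(c_n)_{n \ge 1}$ a dense sequence in $\mathbb{K}$.
A set $L$ with the properties of part (b) is called a Lebesgue set for $f$.

2.31 Prove the following one-sided version of the Lebesgue differentiation theorem for
$d = 1$: If $x$ is a Lebesgue point of $f \in L^1_{\mathrm{loc}}(\mathbb{R})$, then

\[ \lim_{h \downarrow 0} \frac1h \int_x^{x+h} |f(y) - f(x)| \, dy = \lim_{h \downarrow 0} \frac1h \int_{x-h}^{x} |f(y) - f(x)| \, dy = 0. \]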
2.32 Show that a subset $K$ of $c_0$ is relatively compact if and only if there is a $y \in c_0$
such that for all $x \in K$ and $n \ge 1$ we have $|x_n| \le |y_n|$.

2.33 Let $1 \le p < \infty$. Show that a bounded subset $S$ of $\ell^p$ is relatively compact if and
only if for every $\varepsilon > 0$ there exists an index $N \ge 1$ such that

\[ \sup_{x \in S} \sum_{n > N} |x_n|^p < \varepsilon^p. \]

2.34 Let $S$ be a nonempty set. For $1 \le p < \infty$, let $\ell^p(S)$ be the completion of the space
of all finitely nonzero functions $f : S \to \mathbb{K}$, that is, functions such that $f(s) \ne 0$
for at most finitely many different $s \in S$, with the norm

\[ \|f\| := \Bigl( \sum_{s \in S} |f(s)|^p \Bigr)^{1/p}, \]

where the sum extends over the finitely many $s \in S$ for which $f(s) \ne 0$.

(a) Show that $\ell^p(S)$ can be isometrically identified with the space of all countably
nonzero functions $f : S \to \mathbb{K}$, that is, functions such that $f(s) \ne 0$ for at
most countably many different $s \in S$, for which

\[ \|f\| := \Bigl( \sum_{s \in S} |f(s)|^p \Bigr)^{1/p} \]

is finite. How should this sum be interpreted?
(b) Show that $\ell^p(S)$ is a Banach space in a natural way.


2.35 Show that a $\mathbb{K}$-valued measure $\nu$ is absolutely continuous with respect to a measure
$\mu$ if and only if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $|\nu(F)| < \varepsilon$
whenever $F \in \mathscr{F}$ satisfies $\mu(F) < \delta$.

2.36 A $\mathbb{K}$-valued measure $\mu$ on a topological space $X$ is said to be regular, respectively
Radon, if its variation $|\mu|$ is regular (see Definition E.15), respectively Radon
(see Definition E.20). Prove that the sets $M_r(X)$ and $M_R(X)$ of all $\mathbb{K}$-valued Borel
measures on $X$ that are regular, respectively Radon, are closed subspaces of $M(X)$.

2.37 A function $f : [0,1] \to \mathbb{K}$ is said to be absolutely continuous if for every $\varepsilon > 0$
there exists a $\delta > 0$ such that whenever $(a_n)_{n=1}^N$ and $(b_n)_{n=1}^N$ are finite sequences
in $[0,1]$ satisfying $\sum_{n=1}^N (b_n - a_n) < \delta$, then

\[ \sum_{n=1}^N |f(b_n) - f(a_n)| < \varepsilon. \]
It is said to be of bounded variation if

\[ \sup_\pi \operatorname{var}(f; \pi) := \sup_\pi \sum_{n=1}^N |f(t_n) - f(t_{n-1})| < \infty, \]

where the supremum is taken over all finite partitions $\pi = \{t_0, \dots, t_N\}$ of $[0,1]$.
(a) Show that a function $f : [0,1] \to \mathbb{K}$ is absolutely continuous and satisfies
$f(0) = 0$ if and only if there exists a function $g \in L^1(0,1)$ such that

\[ f(t) = \int_0^t g(s) \, ds, \quad t \in [0,1], \]

and that this function $g$, if it exists, is unique.
Hint: For the 'only if' part use the Radon–Nikodým theorem.

(b) Show that the space $NBV[0,1]$ of functions $f : [0,1] \to \mathbb{K}$ of bounded variation
satisfying $f(0) = 0$ is a Banach space with respect to the norm

\[ \|f\|_{NBV[0,1]} := \sup_\pi \operatorname{var}(f; \pi). \]

(c) Show that the space of all absolutely continuous functions $f : [0,1] \to \mathbb{K}$
satisfying $f(0) = 0$ is a closed subspace of $NBV[0,1]$ and that

\[ \|f\|_{NBV[0,1]} = \|g\|_{L^1(0,1)}, \quad f \in NBV[0,1], \]

where $g \in L^1(0,1)$ is the function of part (a).


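2.38 The disc algebra $A(\mathbb{D})$ is the closed subspace of the Banach space $C(\overline{\mathbb{D}})$ consisting
of those functions that are holomorphic on $\mathbb{D}$. By the maximum principle,

\[ \|f\|_\infty = \sup_{\theta \in [-\pi, \pi]} |f(e^{i\theta})|. \]

(a) Show that for all $f \in A(\mathbb{D})$ and $z_0 \in \mathbb{D}$ we have

\[ f(z_0) = \frac{1}{2\pi i} \int_{\mathbb{T}} \frac{f(z)}{z - z_0} \, dz. \]

(b) Show that a function $f \in C(\mathbb{T})$ belongs to $A(\mathbb{D})$ if and only if $\widehat{f}(n) = 0$ for
all $n \in \mathbb{Z} \setminus \mathbb{N}$, where
\[ \widehat{f}(n) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(e^{i\theta}) e^{-in\theta} \, d\theta, \quad n \in \mathbb{Z}. \]

2.39 Let $\mathrm{Lip}[0,1]$ be the vector space of all functions $f : [0,1] \to \mathbb{K}$ for which

\[ \|f\|_{\mathrm{Lip}[0,1]} := |f(0)| + \sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|} \]

is finite. Show that $\mathrm{Lip}[0,1]$ is a Banach space with respect to the norm $\| \cdot \|_{\mathrm{Lip}[0,1]}$.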
2.40 A function $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ is said to have bounded mean oscillation if

\[ |f|_{\mathrm{BMO}(\mathbb{R}^d)} := \sup_B \frac{1}{|B|} \int_B |f(x) - \operatorname{av}_B(f)| \, dx \]

is finite, where the supremum is taken over all balls $B$ in $\mathbb{R}^d$, $|B|$ is the Lebesgue
measure of $B$, and $\operatorname{av}_B(f) := \frac{1}{|B|} \int_B f(y) \, dy$ is the average of $f$ on $B$.
(a) Show that if $f$ and $g$ have bounded mean oscillation, then:

(i) $cf$ has bounded mean oscillation and

\[ |cf|_{\mathrm{BMO}(\mathbb{R}^d)} = |c| \, |f|_{\mathrm{BMO}(\mathbb{R}^d)}; \]

(ii) $f + g$ has bounded mean oscillation and

\[ |f + g|_{\mathrm{BMO}(\mathbb{R}^d)} \le |f|_{\mathrm{BMO}(\mathbb{R}^d)} + |g|_{\mathrm{BMO}(\mathbb{R}^d)}. \]

(b) Show that $|f|_{\mathrm{BMO}(\mathbb{R}^d)} = 0$ if and only if $f$ is almost everywhere constant.
(c) Show that every $f \in L^\infty(\mathbb{R}^d)$ has bounded mean oscillation and

\[ |f|_{\mathrm{BMO}(\mathbb{R}^d)} \le 2 \|f\|_\infty. \]

(d) Show that the unbounded function $x \mapsto \log |x|$ has bounded mean oscillation.
(e) Show that the quotient space $\mathrm{BMO}(\mathbb{R}^d)/\mathbb{K}$, obtained as the
quotient modulo the constant functions of the vector space $\mathrm{BMO}(\mathbb{R}^d)$ of
functions with bounded mean oscillation, is a Banach space in a natural way.


2.41 Show that in a vector lattice $(V, \le)$, the greatest lower bound and the least upper
bound of two elements $u, v \in V$ satisfy $u \wedge v = u - (u - v)^+ = v - (v - u)^+$ and
$u \vee v = u + (v - u)^+ = v + (u - v)^+$.

2.42 Let $V$ be a vector lattice.
(a) Show that for all $v \in V$ we have $v^+ \wedge v^- = 0$ and $v^+ \vee v^- = |v|$.
(b) Show that if $v, w, w' \in V$ satisfy $v = w - w'$ with $w \ge 0$ and $w' \ge 0$, then
$w \ge v^+$ and $w' \ge v^-$.
(c) Show that if $v, w, w' \in V$ satisfy $v = w - w'$ with $w \ge 0$, $w' \ge 0$, and $w \perp w'$,
then $w = v^+$ and $w' = v^-$.

2.43 Prove that if $X$ is a normed vector lattice, the lattice operations $(x, y) \mapsto x \wedge y$ and
$(x, y) \mapsto x \vee y$ are continuous from $X \times X$ to $X$.

2.44 Provide the missing details to the proof, outlined at the end of Section 2.5, that the
spaces $M(\Omega)$ studied in Section 2.4 are Banach lattices.


3 
Hilbert Spaces 


Arguably the most important class of Banach spaces is the class of Hilbert spaces. These 
spaces play a central role in the theory and in various areas of applications, some of 
which will be discussed in later chapters. The present introductory chapter develops the 
basic geometric properties of Hilbert spaces arising from the presence of an inner prod- 
uct generating the norm, such as the orthogonal complementation of closed subspaces, 
the existence of orthonormal bases, and the selfduality of Hilbert spaces embodied by 
the Riesz representation theorem. 


3.1 Hilbert Spaces 


Let $V$ be a vector space. A mapping $\phi : V \times V \to \mathbb{K}$ is called sesquilinear if it is linear
in the first variable and conjugate-linear in the second variable, that is,

\[ \phi(v + v', w) = \phi(v, w) + \phi(v', w), \qquad \phi(cv, w) = c\,\phi(v, w), \tag{3.1} \]
\[ \phi(v, w + w') = \phi(v, w) + \phi(v, w'), \qquad \phi(v, cw) = \overline{c}\,\phi(v, w), \tag{3.2} \]

for all $c \in \mathbb{K}$ and $v, v', w, w' \in V$. The complex conjugation in (3.2) is of course redundant
when the scalar field is real, and sesquilinearity reduces to bilinearity in that case.


Definition 3.1 (Inner products). An inner product space is a pair $(H, (\cdot|\cdot))$, where $H$
is a vector space and $(\cdot|\cdot)$ is an inner product on $H \times H$, that is, a sesquilinear mapping
from $H \times H$ to $\mathbb{K}$ with the following properties:

(i) $(x|x) \ge 0$ for all $x \in H$, and $(x|x) = 0$ implies $x = 0$;
(ii) $(x|y) = \overline{(y|x)}$ for all $x, y \in H$.

The conjugation bar in (ii) is again redundant when the scalar field is real. If (ii) holds,
then (3.1) implies (3.2).
It will be used frequently without further comment that
if $(x|y) = 0$ for all $y \in H$, then $x = 0$.

Indeed, the hypothesis implies that $(x|x) = 0$, and then $x = 0$ by the definition of an
inner product.
When the inner product $(\cdot|\cdot)$ is understood we simply write $H$ instead of $(H, (\cdot|\cdot))$.


Example 3.2. Here are some examples of inner products:

(i) on $\mathbb{K}^d$ an inner product is given by $(x|y) = \sum_{n=1}^d x_n \overline{y_n}$;
(ii) on $\ell^2$ an inner product is given by $(a|b) = \sum_{n \ge 1} a_n \overline{b_n}$;
(iii) on $L^2(\Omega, \mu)$ an inner product is given by $(f|g) = \int_\Omega f \overline{g} \, d\mu$.

In order to turn inner product spaces into normed vector spaces we need the following
inequality. Its finite-dimensional version has already been used in various places in
Chapters 1 and 2.


Proposition 3.3 (Cauchy–Schwarz inequality). Let $H$ be a vector space and consider a
sesquilinear mapping $(\cdot|\cdot) : H \times H \to \mathbb{K}$ with the following properties:

(i) $(x|x) \ge 0$ for all $x \in H$;
(ii) $(x|y) = \overline{(y|x)}$ for all $x, y \in H$.

Then for all $x, y \in H$ we have
\[ |(x|y)|^2 \le (x|x)(y|y). \]


Proof We may assume that $y \ne 0$, since otherwise the inequality trivially holds. Fix a
scalar $c \in \mathbb{K}$. Then

\begin{align*}
0 \le (x - cy \,|\, x - cy) &= (x|x) - (x|cy) - (cy|x) + (cy|cy) \\
&= (x|x) - \overline{c}\,(x|y) - c\,\overline{(x|y)} + |c|^2 (y|y) \\
&= (x|x) - 2\operatorname{Re}\bigl( \overline{c}\,(x|y) \bigr) + |c|^2 (y|y).
\end{align*}

The choice $c = (x|y)/(y|y)$ results in the inequality

\[ 0 \le (x|x) - 2\operatorname{Re} \frac{|(x|y)|^2}{(y|y)} + \frac{|(x|y)|^2}{(y|y)^2} (y|y) = (x|x) - \frac{|(x|y)|^2}{(y|y)}. \]

Multiplying with $(y|y)$ gives the desired result.

[Portrait: David Hilbert, 1862-1943]

This result is valid without assuming the nondegeneracy assumption in the second
part of the defining property (i) of an inner product.
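
As a numerical sanity check of Proposition 3.3, the following sketch evaluates both sides of the Cauchy–Schwarz inequality for the standard inner product on $\mathbb{C}^5$ of Example 3.2(i). The helper names (`inner`, `cauchy_schwarz_gap`, `residual_norm_sq`) and the random test vectors are our own illustrative choices; the last helper evaluates $(x - cy|x - cy)$ for the scalar $c = (x|y)/(y|y)$ used in the proof.

```python
import random

def inner(x, y):
    """Standard inner product on C^d: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def cauchy_schwarz_gap(x, y):
    """Return (x|x)(y|y) - |(x|y)|^2, which Proposition 3.3 says is >= 0."""
    return (inner(x, x) * inner(y, y)).real - abs(inner(x, y)) ** 2

def residual_norm_sq(x, y):
    """(x - c y | x - c y) for the choice c = (x|y)/(y|y) made in the proof."""
    c = inner(x, y) / inner(y, y)
    r = [a - c * b for a, b in zip(x, y)]
    return inner(r, r).real

random.seed(0)  # arbitrary seed, only to make the run reproducible
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(5)]
y = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(5)]

gap = cauchy_schwarz_gap(x, y)
# Equality holds when y is a scalar multiple of x, for instance y = 2i x.
gap_parallel = cauchy_schwarz_gap(x, [2j * a for a in x])
```

Multiplying `residual_norm_sq(x, y)` by $(y|y)$ reproduces `gap`, which is exactly the last step of the proof.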


Proposition 3.4. Every inner product space $H$ can be made into a normed space by
defining

\[ \|x\| := (x|x)^{1/2}, \quad x \in H. \]

Proof We must check that $\| \cdot \|$ defines a norm on $H$. It is immediate that $\|x\| = 0$
implies $x = 0$ and that $\|cx\| = |c| \|x\|$. The triangle inequality follows from the
Cauchy–Schwarz inequality:

\[ \|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}(x|y) + \|y\|^2 \le \|x\|^2 + 2\|x\|\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2. \]


Henceforth it is understood that inner product spaces are always endowed with this 
norm. 
As a corollary to the Cauchy—Schwarz inequality we record: 


Corollary 3.5. Every inner product $(x|y)$ is jointly continuous as a function of $x$ and $y$.

Proof It suffices to show that if $x_n \to x$ and $y_n \to y$, then $(x_n|y_n) \to (x|y)$. We have

\begin{align*}
|(x_n|y_n) - (x|y)| &\le |(x_n|y_n) - (x_n|y)| + |(x_n|y) - (x|y)| \\
&= |(x_n|y_n - y)| + |(x_n - x|y)| \le \|x_n\| \|y_n - y\| + \|x_n - x\| \|y\|.
\end{align*}

Since convergent sequences are bounded, the number $M := \sup_{n \ge 1} \|x_n\|$ is finite, and
we find

\[ |(x_n|y_n) - (x|y)| \le M \|y_n - y\| + \|x_n - x\| \|y\|. \]

Both terms on the right-hand side tend to $0$ as $n \to \infty$.


Proposition 3.6 (Parallelogram identity). In every inner product space $H$ the parallelogram
identity holds: for all $x, y \in H$ we have

\[ 2\|x\|^2 + 2\|y\|^2 = \|x + y\|^2 + \|x - y\|^2. \]

Conversely, if $X$ is a normed space with the property that the parallelogram identity
holds for all $x, y \in X$, then $X$ is an inner product space, that is, there exists an inner
product on $X$ generating the norm of $X$.

In what follows we only need the first part of the proposition. Its converse is included
for reasons of completeness and can be safely skipped upon first reading.

The inequality '$\ge$' admits $L^p$-versions, known as Clarkson's inequalities (see Problem 5.26).
Proof The proof of the first part is routine:

\[ \|x + y\|^2 + \|x - y\|^2 = (x + y|x + y) + (x - y|x - y) = 2(x|x) + 2(y|y) = 2\|x\|^2 + 2\|y\|^2. \]

The proof of the second assertion is quite involved and relies on finding a formula for
the inner product in terms of the norm of $X$. To get an idea what this formula might look
like we first assume there is such an inner product and denote it by $(\cdot|\cdot)$. Then, using
that $(y|x) = \overline{(x|y)}$,

\[ \|x + y\|^2 = (x + y|x + y) = (x|x) + (x|y) + (y|x) + (y|y) = \|x\|^2 + 2\operatorname{Re}(x|y) + \|y\|^2 \]

and similarly $\|x - y\|^2 = \|x\|^2 - 2\operatorname{Re}(x|y) + \|y\|^2$. Combining the two identities, we get

\[ \operatorname{Re}(x|y) = \frac14 \bigl( \|x + y\|^2 - \|x - y\|^2 \bigr). \tag{3.3} \]

If the scalar field $\mathbb{K}$ is real, then $\operatorname{Re}(x|y) = (x|y)$ and the above identity expresses the
inner product in terms of the norm. If the scalar field $\mathbb{K}$ is complex, by the previous
identity we obtain

\[ \operatorname{Im}(x|y) = \operatorname{Re}(x|iy) = \frac14 \bigl( \|x + iy\|^2 - \|x - iy\|^2 \bigr). \]
This leads to the identity
\begin{align*}
(x|y) &= \operatorname{Re}(x|y) + i \operatorname{Im}(x|y) \\
&= \frac14 \bigl( \|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2 \bigr). \tag{3.4}
\end{align*}

In an arbitrary normed space we could now try to define an inner product by (3.3) if
$\mathbb{K} = \mathbb{R}$, respectively by (3.4) if $\mathbb{K} = \mathbb{C}$, but this does not always give an inner product. If, however,
the parallelogram identity holds, then it does. Let us check this for the case $\mathbb{K} = \mathbb{R}$. First,


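\[ (x|x) = \frac14 \bigl( \|x + x\|^2 - \|x - x\|^2 \bigr) = \|x\|^2 \ge 0 \]

and $(x|x) = 0$ implies $\|x\| = 0$ and hence $x = 0$. Second, the identity $(x|y) = (y|x)$ is
immediate. Third, applying the parallelogram identity to the pairs $x$, $x' + y$ and $x$, $x' - y$,

\begin{align*}
(x + x'|y) &= \frac14 \bigl( \|x + x' + y\|^2 - \|x + x' - y\|^2 \bigr) \\
&= \frac14 \bigl( 2\|x\|^2 + 2\|x' + y\|^2 - \|x - (x' + y)\|^2 \bigr) \\
&\quad - \frac14 \bigl( 2\|x\|^2 + 2\|x' - y\|^2 - \|x - (x' - y)\|^2 \bigr) \\
&= \frac12 \bigl( \|x' + y\|^2 - \|x' - y\|^2 \bigr) \\
&\quad - \frac14 \bigl( \|x - (x' + y)\|^2 - \|x - (x' - y)\|^2 \bigr) \\
&= 2(x'|y) + (x - x'|y).
\end{align*}

This proves that
\[ (x + x'|y) - (x - x'|y) = 2(x'|y). \tag{3.5} \]

Taking $x = x'$ in (3.5) we obtain $(2x|y) = 2(x|y)$. Now let $x_0, x_1 \in X$ be arbitrary. Applying
(3.5) to $x = \frac12 (x_0 - x_1)$ and $x' = \frac12 (x_0 + x_1)$ gives

\[ (x_0|y) - (-x_1|y) = 2 \bigl( \tfrac12 (x_0 + x_1) \big| y \bigr), \]

which, in view of the earlier identities and the relation $(-x_1|y) = -(x_1|y)$ (immediate from (3.3)), simplifies to

\[ (x_0|y) + (x_1|y) = (x_0 + x_1|y). \]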


This gives additivity in the first coordinate. Additivity in the second coordinate is proved
similarly. Using this inductively, for positive integers $k$ we obtain

\begin{align*}
(kx|y) = ((k-1)x + x|y) &= ((k-1)x|y) + (x|y) \\
&= ((k-2)x + x|y) + (x|y) = ((k-2)x|y) + 2(x|y) \\
&= \dots = (x|y) + (k-1)(x|y) = k(x|y).
\end{align*}

Applying this to $k^{-1}x$ we also find $k^{-1}(x|y) = (k^{-1}x|y)$. Next let $q = m/n$ with $m, n$
positive integers. By what we just proved,

\[ (qx|y) = m(n^{-1}x|y) = mn^{-1}(x|y) = q(x|y). \]

This proves homogeneity with respect to multiplication with the positive rationals. For
such rationals we also have $(-qx|y) = -(qx|y) = -q(x|y)$, and therefore homogeneity
holds for all rationals. Finally, by the continuity of the norm $\| \cdot \|$, the mapping
$q \mapsto (qx|y)$ is continuous. The mapping $q \mapsto q(x|y)$ is continuous for trivial reasons.
Therefore, the identity $(cx|y) = c(x|y)$ for arbitrary $c \in \mathbb{R}$ follows by approximation by
rationals.

This completes the proof that for $\mathbb{K} = \mathbb{R}$, the formula for $(\cdot|\cdot)$ in (3.3) indeed defines
an inner product. We have already seen that $(x|x) = \|x\|^2$, so this inner product generates
the norm of $X$.

For $\mathbb{K} = \mathbb{C}$ it can be verified in a similar manner that the formula for $(\cdot|\cdot)$ in (3.4)
defines an inner product and that it generates the norm of $X$.
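
To make the role of the parallelogram identity concrete, here is a small sketch in $\mathbb{R}^d$. It checks that the real polarization formula (3.3) recovers the dot product from the Euclidean norm, and that the $\ell^1$-norm violates the parallelogram identity, so that it cannot be generated by any inner product. The function names and the test vectors are illustrative assumptions of ours.

```python
import math

def norm2(x):
    """Euclidean norm on R^d."""
    return math.sqrt(sum(a * a for a in x))

def norm1(x):
    """l^1 norm on R^d."""
    return sum(abs(a) for a in x)

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def polarization(norm, x, y):
    """Real polarization formula (3.3): (1/4)(||x+y||^2 - ||x-y||^2)."""
    return 0.25 * (norm(add(x, y)) ** 2 - norm(sub(x, y)) ** 2)

def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2; zero iff the identity holds for x, y."""
    return (norm(add(x, y)) ** 2 + norm(sub(x, y)) ** 2
            - 2 * norm(x) ** 2 - 2 * norm(y) ** 2)

x, y = [1.0, 2.0, -1.0], [0.5, -1.0, 3.0]
dot = sum(a * b for a, b in zip(x, y))

recovered = polarization(norm2, x, y)                    # agrees with the dot product
defect_euclid = parallelogram_defect(norm2, x, y)        # zero up to rounding
defect_l1 = parallelogram_defect(norm1, [1.0, 0.0], [0.0, 1.0])  # nonzero
```

For the two standard unit vectors of $\mathbb{R}^2$ the $\ell^1$-defect equals $4 + 4 - 2 - 2 = 4$, so the converse part of Proposition 3.6 does not apply to this norm.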


Definition 3.7 (Hilbert spaces). A Hilbert space is a complete inner product space. 
Thus, by definition, every Hilbert space is a Banach space. 


Example 3.8. By the completeness results in the preceding chapter, all three spaces $\mathbb{K}^d$,
$\ell^2$, $L^2(\Omega, \mu)$ featuring in Example 3.2 are Hilbert spaces.


Further examples of Hilbert spaces will be given in the problems section. 


Proposition 3.9 (Completion). Let $H$ be an inner product space. On its completion $\overline{H}$
as a normed space, a well defined inner product is obtained by setting

\[ (x|y) := \lim_{n \to \infty} (x_n|y_n), \quad x, y \in \overline{H}, \]

whenever $x_n, y_n \in H$ satisfy $x = \lim_{n \to \infty} x_n$ and $y = \lim_{n \to \infty} y_n$. The norm associated with
this inner product coincides with the norm of $\overline{H}$ obtained by completing $H$.

Proof The proof relies on a few routine verifications, some of which are left as an
exercise to the reader.

First of all, it easily follows from Corollary 3.5 that $(x|y)$ is independent of the
approximating sequences and agrees with their inner product in $H$ when $x, y \in H$. To see
that $(x|y)$ is indeed an inner product, suppose that $(x|x) = 0$ and let $x = \lim_{n \to \infty} x_n$ with
each $x_n \in H$. Then $\lim_{n \to \infty} (x_n|x_n) = 0$ and therefore $\lim_{n \to \infty} x_n = 0$ which, by the
construction of the completion, means that the Cauchy sequence $(x_n)_{n \ge 1}$ defines the zero
element of $\overline{H}$. It follows that $x = \lim_{n \to \infty} x_n = 0$ in $\overline{H}$. The remaining properties of an
inner product follow trivially by limiting arguments.

Finally, the norm of $\overline{H}$ agrees with the norm generated by its inner product. This
follows again by approximation: if $x = \lim_{n \to \infty} x_n$ with each $x_n \in H$, then for the norm
$\| \cdot \|$ obtained via the completion procedure in the proof of Theorem 1.5 we have

\[ \|x\|^2 = \lim_{n \to \infty} \|x_n\|^2 = \lim_{n \to \infty} (x_n|x_n) = (x|x), \]

the first step being true from the definition of this norm, the second because the norm
of $H$ is generated by its inner product, and the third because the inner product $(\cdot|\cdot)$ is
jointly continuous.


3.2 Orthogonal Complements 
Throughout this section we fix a Hilbert space H. 


Definition 3.10 (Orthogonality). The elements $x, x' \in H$ are said to be orthogonal, notation

\[ x \perp x', \]

if $(x|x') = 0$. Two subsets $A$ and $B$ of $H$ are called orthogonal if $a \perp b$ for all $a \in A$ and
$b \in B$.

Orthogonal elements $x \perp x'$ satisfy the Pythagorean identity

\[ \|x + x'\|^2 = \|x\|^2 + \|x'\|^2, \]
as is seen by expanding the square norms in terms of inner products.


Definition 3.11 (Orthogonal complement). The orthogonal complement of a subset $A$
of $H$ is the set

\[ A^\perp := \{ x \in H : \ x \perp a \text{ for all } a \in A \}. \]

The orthogonal complement $A^\perp$ of a subset $A$ is a closed subspace of $H$. Indeed, it
is trivially checked that $A^\perp$ is a vector space. To prove its closedness, let $x_n \to x$ in
$H$ with $x_n \in A^\perp$. Then, by the continuity of the inner product, for all $a \in A$ we obtain
$(x|a) = \lim_{n \to \infty} (x_n|a) = 0$.

The most important result on orthogonality is certainly the fact that every closed
subspace $Y$ of a Hilbert space is orthogonally complemented by $Y^\perp$. This is the content
of Theorem 3.13 below. For its proof we need the following approximation theorem for
convex closed sets in Hilbert space. Recall that a subset $C$ of a vector space is called
convex if for all $x_0, x_1 \in C$ we have $(1 - \lambda) x_0 + \lambda x_1 \in C$ for all $0 \le \lambda \le 1$.


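Theorem 3.12 (Best approximation). Let $C$ be a nonempty convex closed subset of $H$.
Then for all $x \in H$ there exists a unique $c \in C$ that minimises the distance from $x$ to the
points of $C$:

\[ \|x - c\| = \min_{y \in C} \|x - y\|. \]
Proof Let $(y_n)_{n \ge 1}$ be a sequence in $C$ such that
\[ \lim_{n \to \infty} \|x - y_n\| = \inf_{y \in C} \|x - y\| =: D. \]

We claim that this sequence is Cauchy. By the parallelogram identity of Proposition 3.6,
applied to the vectors $x - y_n$ and $x - y_m$,

\[ \|y_n - y_m\|^2 + \|2x - (y_n + y_m)\|^2 = 2\|x - y_n\|^2 + 2\|x - y_m\|^2. \]
As $m, n \to \infty$, the right-hand side tends to $2D^2 + 2D^2 = 4D^2$, whereas from $\frac12 (y_m + y_n) \in C$
(by convexity) it follows that
\[ \|2x - (y_n + y_m)\|^2 = 4 \|x - \tfrac12 (y_n + y_m)\|^2 \ge 4D^2. \]
It follows that
\[ \limsup_{m,n \to \infty} \|y_n - y_m\|^2 \le 4D^2 - 4D^2 = 0. \]

The limit superior is also nonnegative, and therefore it equals $0$. This proves the claim.
Since $H$ is complete we have $\lim_{n \to \infty} y_n = c$ for some $c \in H$, and since $C$ is closed we
have $c \in C$. Now $\|x - c\| = \lim_{n \to \infty} \|x - y_n\| = D$, so $c$ minimises the distance to $x$.

Uniqueness follows from the same identity: if $c' \in C$ also satisfies $\|x - c'\| = D$, then
applying the parallelogram identity to $x - c$ and $x - c'$ gives
$\|c - c'\|^2 = 4D^2 - \|2x - (c + c')\|^2 \le 4D^2 - 4D^2 = 0$, so $c' = c$.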


Both the existence and uniqueness parts of this theorem fail for general Banach 
spaces; see Problems 3.5 and 3.6. 
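
The following sketch illustrates Theorem 3.12 in the Hilbert space $\mathbb{R}^3$ for one particular closed convex set, the box $C = [0,1]^3$, where the minimiser is obtained by clamping each coordinate. The choice of set, the sample point, and all function names are assumptions made for illustration only. The code also checks the estimate $\|y - z\|^2 \le 2\|x - y\|^2 + 2\|x - z\|^2 - 4D^2$ for $y, z \in C$, which is the parallelogram step of the proof.

```python
import math
import random

def dist(x, y):
    """Euclidean distance in R^d."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def project_onto_box(x, lo=0.0, hi=1.0):
    """Nearest point to x in the closed convex set C = [lo, hi]^d (coordinatewise clamp)."""
    return [min(max(a, lo), hi) for a in x]

random.seed(1)  # arbitrary seed, only to make the run reproducible
x = [1.7, -0.4, 0.3]
c = project_onto_box(x)
D = dist(x, c)

# Random points of C; none of them should be closer to x than c is.
samples = [[random.random() for _ in x] for _ in range(1000)]
closest_sample_distance = min(dist(x, y) for y in samples)

def proof_bound_holds(y, z, tol=1e-12):
    """Check ||y - z||^2 <= 2||x - y||^2 + 2||x - z||^2 - 4 D^2 for y, z in C."""
    return dist(y, z) ** 2 <= 2 * dist(x, y) ** 2 + 2 * dist(x, z) ** 2 - 4 * D ** 2 + tol

bound_ok = all(proof_bound_holds(samples[i], samples[i + 1]) for i in range(len(samples) - 1))
```

The bound shows why points of $C$ that are nearly optimal must be close to each other, which is the mechanism behind both the Cauchy property and uniqueness.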
Theorem 3.13 (Closed subspaces are orthogonally complemented). If $Y$ is a closed
linear subspace of $H$, then we have an orthogonal direct sum decomposition

\[ H = Y \oplus Y^\perp, \]
that is, we have $Y \cap Y^\perp = \{0\}$, $Y + Y^\perp = H$, and $Y \perp Y^\perp$.

Proof We have already seen that $Y^\perp$ is a closed subspace. If $y \in Y \cap Y^\perp$, then $y \perp y$, so
$(y|y) = 0$ and $y = 0$. It remains to show that $Y + Y^\perp = H$.

Let $x \in H$ be arbitrary and fixed. We must show that $x \in Y + Y^\perp$. Let $\pi_Y : H \to Y$
denote the mapping arising from Theorem 3.12, that is, $\pi_Y x$ is the unique element of $Y$
minimising the distance to $x$:

\[ \|x - \pi_Y x\| = \min_{y \in Y} \|x - y\|. \]
Set $y_0 := \pi_Y x$ and $y_1 := x - y_0$. Then $y_0 \in Y$, and for all $y \in Y$ we have

\[ \|y_1\| = \|x - y_0\| \le \|x - \underbrace{(y_0 - y)}_{\in Y}\| = \|y + (x - y_0)\| = \|y + y_1\|. \tag{3.6} \]

We claim that (3.6) implies $y_1 \in Y^\perp$. To see this, fix a nonzero $y \in Y$. For any $c \in \mathbb{K}$ we
have, by (3.6),
\[ \|y_1\|^2 \le \|cy + y_1\|^2 = |c|^2 \|y\|^2 + 2\operatorname{Re}(cy|y_1) + \|y_1\|^2. \]

Taking $c = -(y_1|y)/\|y\|^2$, this gives
\[ 0 \le \frac{|(y_1|y)|^2}{\|y\|^2} - 2 \frac{|(y_1|y)|^2}{\|y\|^2} = -\frac{|(y_1|y)|^2}{\|y\|^2}, \]

which is only possible if $(y_1|y) = 0$. Since $0 \ne y \in Y$ was arbitrary, this shows that
$y_1 \in Y^\perp$. This proves the claim. It follows that $x = y_0 + y_1$ belongs to $Y + Y^\perp$.


Definition 3.14 (Orthogonal projection onto a closed subspace). The projection $\pi_Y$ onto
$Y$ along $Y^\perp$ given by

\[ \pi_Y(y + y') := y, \quad y \in Y, \ y' \in Y^\perp, \]
is called the orthogonal projection onto $Y$.

The Pythagorean identity implies that $\|\pi_Y\| \le 1$ (with equality if $Y \ne \{0\}$). As was
shown in the course of the proof of Theorem 3.13, the projection $\pi_Y$ coincides with
the mapping arising from Theorem 3.12. For general closed convex sets $C$, the distance
minimising mapping of Theorem 3.12 is generally nonlinear.

As a corollary to Theorem 3.13, for closed subspaces $Y$ we get $(Y^\perp)^\perp = Y$, and more
generally for any subspace $Y$ we get (since $Y^\perp = (\overline{Y})^\perp$)

\[ (Y^\perp)^\perp = \overline{Y}. \]
By way of example, if $(h_n)_{n=1}^N$ is an orthonormal sequence, the orthogonal projection
$\pi_N$ in $H$ onto the span of $h_1, \dots, h_N$ is given by

\[ \pi_N x = \sum_{n=1}^N (x|h_n) h_n, \]

as the reader will have no difficulty checking.
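
The projection formula can be tried out numerically. In the sketch below, which works in $\mathbb{R}^4$ with the dot product, an orthonormal family is first produced by a Gram–Schmidt step (a standard device that is not part of the text above and is used here only to obtain test data), and then $\pi_N x = \sum_{n=1}^N (x|h_n) h_n$ is evaluated. All names and numbers are illustrative assumptions.

```python
import math

def inner(x, y):
    """Dot product on R^d."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Orthonormalise linearly independent vectors in R^d (helper for test data)."""
    basis = []
    for v in vectors:
        w = list(v)
        for h in basis:
            coeff = inner(w, h)
            w = [a - coeff * b for a, b in zip(w, h)]
        length = math.sqrt(inner(w, w))
        basis.append([a / length for a in w])
    return basis

def project(x, basis):
    """pi_N x = sum_n (x|h_n) h_n for an orthonormal family h_1, ..., h_N."""
    p = [0.0] * len(x)
    for h in basis:
        coeff = inner(x, h)
        p = [a + coeff * b for a, b in zip(p, h)]
    return p

basis = gram_schmidt([[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]])
x = [3.0, -1.0, 2.0, 5.0]
p = project(x, basis)               # component in Y = span(h_1, h_2)
r = [a - b for a, b in zip(x, p)]   # component in the orthogonal complement of Y
```

The residual `r` is orthogonal to each basis vector, the Pythagorean identity $\|x\|^2 = \|p\|^2 + \|r\|^2$ holds, and projecting `p` again returns `p`, in line with Theorem 3.13 and Definition 3.14.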


3.3 The Riesz Representation Theorem 


Let $H$ be a Hilbert space. As a first application of Theorem 3.13 we prove the Riesz
representation theorem, which sets up a conjugate-linear identification of a Hilbert
space $H$ and its dual $H^* = \mathscr{L}(H, \mathbb{K})$.

By the Cauchy–Schwarz inequality, every $h \in H$ defines a bounded functional
$\psi_h : H \to \mathbb{K}$ by taking inner products:

\[ \psi_h(x) := (x|h). \]
Boundedness is evident from $|\psi_h(x)| = |(x|h)| \le \|x\| \|h\|$, which shows that
$\|\psi_h\| \le \|h\|$. From $\psi_h(h) = (h|h) = \|h\|^2$ we see that also $\|\psi_h\| \ge \|h\|$,
so that $\|\psi_h\| = \|h\|$.

[Portrait: Frigyes Riesz, 1880-1956]

All bounded functionals $\phi : H \to \mathbb{K}$ arise in this way:


Theorem 3.15 (Riesz representation theorem). If $\phi : H \to \mathbb{K}$ is a bounded functional,
there exists a unique element $h \in H$ such that $\phi = \psi_h$, that is,

\[ \phi(x) = (x|h), \quad x \in H. \]

Proof If $\phi(x) = 0$ for all $x \in H$, we take $h = 0$. Henceforth we shall assume that
$\phi \ne 0$. Then $(N(\phi))^\perp \ne \{0\}$ by Theorem 3.13, and we can choose a norm one vector
$y_0 \in (N(\phi))^\perp$. Fix an arbitrary $x \in H$. With $c := \phi(x)/\phi(y_0)$ we have $\phi(x - cy_0) =
\phi(x) - c\phi(y_0) = 0$. This means that $x - cy_0 \in N(\phi)$, so $x - cy_0 \perp y_0$ and

\[ \phi(x) = c\phi(y_0) = \phi(y_0)(cy_0|y_0) = \phi(y_0)(x|y_0) = (x|\overline{\phi(y_0)}\, y_0). \]

This proves that $\phi = \psi_h$ with $h := \overline{\phi(y_0)}\, y_0$.
To prove uniqueness, suppose that $\phi = \psi_h = \psi_{h'}$ for $h, h' \in H$. Then $\|h - h'\|^2 =
(h - h'|h - h') = \psi_h(h - h') - \psi_{h'}(h - h') = 0$ and therefore $h' = h$.
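
In finite dimensions the construction in the proof can be carried out explicitly. The sketch below takes $H = \mathbb{C}^3$ with the inner product of Example 3.2(i) and a functional $\phi(x) = \sum_n a_n x_n$ with hypothetical coefficients $a$; it picks a norm one vector $y_0$ in $(N(\phi))^\perp$, forms $h := \overline{\phi(y_0)}\, y_0$ as in the proof, and compares $\phi(x)$ with $(x|h)$. The coefficients, the test vector, and the function names are our own choices.

```python
import math

def inner(x, y):
    """(x|y) on C^d: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

a = [2 + 1j, -1j, 3 + 0j]  # hypothetical coefficients defining phi

def phi(x):
    """A bounded linear functional on C^3: phi(x) = sum_n a_n x_n."""
    return sum(an * xn for an, xn in zip(a, x))

# phi(x) = (x | conj(a)), so N(phi) is the orthogonal complement of conj(a)
# and conj(a) spans N(phi)^perp.
a_bar = [an.conjugate() for an in a]
y0 = [v / norm(a_bar) for v in a_bar]        # norm one vector in N(phi)^perp
h = [phi(y0).conjugate() * v for v in y0]    # h := conj(phi(y0)) y0, as in the proof

x = [1 - 2j, 0.5 + 0j, -4 + 1j]
lhs, rhs = phi(x), inner(x, h)
```

Here the representing vector comes out as $h = \overline{a}$, and $\|h\| = \|a\|$ agrees with the norm of $\phi$ computed before the theorem.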
Proposition 3.16. For every bounded sequence $(x_n)_{n \ge 1}$ in $H$ there exist a subsequence
$(x_{n_k})_{k \ge 1}$ and an $x \in H$ such that

\[ \lim_{k \to \infty} (h|x_{n_k}) = (h|x), \quad h \in H. \]

Proof Let $H_0$ denote the closed linear span of $(x_n)_{n \ge 1}$ and let $H_0^\perp$ be its orthogonal
complement, so that we have the orthogonal decomposition $H = H_0 \oplus H_0^\perp$.

Step 1 – We begin by proving that there exist a subsequence $(x_{n_k})_{k \ge 1}$ and an $x \in H_0$
such that

\[ \lim_{k \to \infty} (h|x_{n_k}) = (h|x), \quad h \in H_0. \]

Let $(y_j)_{j \ge 1}$ be a sequence whose linear span $Y$ is dense in $H_0$. The sequence $(x_n)_{n \ge 1}$
has a subsequence which, after relabelling, we may call $(x_n^{(1)})_{n \ge 1}$, such that the limit
$\phi(y_1) := \lim_{n \to \infty} (y_1|x_n^{(1)})$ exists. This sequence has a further subsequence which, after
relabelling, we may call $(x_n^{(2)})_{n \ge 1}$, such that the limit $\phi(y_2) := \lim_{n \to \infty} (y_2|x_n^{(2)})$ exists.
Note that we also have $\phi(y_1) = \lim_{n \to \infty} (y_1|x_n^{(2)})$. Continuing inductively, for every
$k \ge 1$ we obtain a subsequence $(x_n^{(k)})_{n \ge 1}$ with the property that the limit $\phi(y_j) =
\lim_{n \to \infty} (y_j|x_n^{(k)})$ exists for $j = 1, \dots, k$. The 'diagonal subsequence' $(x_n^{(n)})_{n \ge 1}$ has the
property that the limit $\phi(y_j) = \lim_{n \to \infty} (y_j|x_n^{(n)})$ exists for all $j \ge 1$. By linearity, the
limit $\phi(y) := \lim_{n \to \infty} (y|x_n^{(n)})$ exists for all $y \in Y$. Clearly $y \mapsto \phi(y)$ is linear and $|\phi(y)| \le
M \|y\|$, where $M := \sup_{n \ge 1} \|x_n\|$. This shows that $\phi$ is bounded as a mapping from $Y$ to
$\mathbb{K}$. Since $Y$ is dense in $H_0$, Proposition 1.18 implies that $\phi$ has a unique bounded
extension of the same norm to all of $H_0$. By the Riesz representation theorem, applied to
the Hilbert space $H_0$, there exists an $x \in H_0$ such that $\phi(h) = (h|x)$ for all $h \in H_0$. This
element has the required properties: for all $h \in Y$ we have

\[ \lim_{n \to \infty} (h|x_n^{(n)}) = \phi(h) = (h|x). \]
For general $h \in H_0$ the same identity holds by an approximation argument, using that $Y$
is dense in $H_0$.

Step 2 – We now show that

\[ \lim_{n \to \infty} (h|x_n^{(n)}) = (h|x), \quad h \in H, \]
where $(x_n^{(n)})_{n \ge 1}$ and $x \in H_0$ are as in Step 1. To this end let $h \in H$ be arbitrary and
write $h = h_0 + h_0^\perp$ along the orthogonal decomposition $H = H_0 \oplus H_0^\perp$. Step 1 gives
$\lim_{n \to \infty} (h_0|x_n^{(n)}) = (h_0|x)$ for $h_0 \in H_0$. Trivially, $\lim_{n \to \infty} (h_0^\perp|x_n^{(n)}) = 0 = (h_0^\perp|x)$ for
all $h_0^\perp \in H_0^\perp$. This concludes the proof.

The argument in Step 1 is known as a diagonal argument.
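We have the following simple criterion for the convergence of a series whose terms are
pairwise orthogonal.

Proposition 3.17. Let $(x_n)_{n \ge 1}$ be a sequence in $H$ with $x_m \perp x_n$ for all $m \ne n$. The
following assertions are equivalent:

(1) $\sum_{n \ge 1} x_n$ converges in $H$;
(2) $\sum_{n \ge 1} \|x_n\|^2 < \infty$.

In this situation,

\[ \Bigl\| \sum_{n \ge 1} x_n \Bigr\|^2 = \sum_{n \ge 1} \|x_n\|^2. \]

Proof Let us first note that if $I$ is any finite set of positive integers, then

\[ \Bigl\| \sum_{n \in I} x_n \Bigr\|^2 = \Bigl( \sum_{m \in I} x_m \Big| \sum_{n \in I} x_n \Bigr) = \sum_{m \in I} \sum_{n \in I} (x_m|x_n) = \sum_{n \in I} \|x_n\|^2, \]

since $(x_m|x_n) = 0$ if $m \ne n$.

(1)$\Rightarrow$(2): If $\sum_{n \ge 1} x_n$ converges in $H$, say to $x$, then $x = \lim_{N \to \infty} \sum_{n=1}^N x_n$ in $H$ and
therefore

\[ \|x\|^2 = \lim_{N \to \infty} \Bigl\| \sum_{n=1}^N x_n \Bigr\|^2 = \lim_{N \to \infty} \sum_{n=1}^N \|x_n\|^2. \]

It follows that $\sum_{n \ge 1} \|x_n\|^2 = \|x\|^2$. This also proves the final identity in the statement of
the proposition, since by definition $\sum_{n \ge 1} x_n = x$.

(2)$\Rightarrow$(1): Suppose, conversely, that $\sum_{n \ge 1} \|x_n\|^2 < \infty$. Then

\[ \lim_{\substack{M,N \to \infty \\ N > M}} \Bigl\| \sum_{n=1}^N x_n - \sum_{n=1}^M x_n \Bigr\|^2
 = \lim_{\substack{M,N \to \infty \\ N > M}} \Bigl\| \sum_{n=M+1}^N x_n \Bigr\|^2
 = \lim_{\substack{M,N \to \infty \\ N > M}} \sum_{n=M+1}^N \|x_n\|^2 = 0. \]

It follows that $(\sum_{n=1}^N x_n)_{N \ge 1}$ is Cauchy, and hence convergent.
As a special case of Proposition 3.17 we record: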


Proposition 3.18 (Parseval). Let $(h_n)_{n \ge 1}$ be an orthonormal sequence in $H$. For a
scalar sequence $(c_n)_{n \ge 1}$, the following assertions are equivalent:

(1) $\sum_{n \ge 1} c_n h_n$ converges in $H$;
(2) $\sum_{n \ge 1} |c_n|^2 < \infty$.
In this situation the Parseval identity holds:

\[ \Bigl\| \sum_{n \ge 1} c_n h_n \Bigr\|^2 = \sum_{n \ge 1} |c_n|^2. \]
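
A finite-dimensional instance of the Parseval identity can be checked directly. The sketch below uses an orthonormal family in $\mathbb{R}^4$ given by the rows of a normalised $4 \times 4$ Hadamard matrix (an illustrative choice of ours, as are the coefficients), forms $x = \sum_n c_n h_n$, and compares $\|x\|^2$ with $\sum_n |c_n|^2$. It also recovers the coefficients as inner products $(x|h_n)$.

```python
def inner(x, y):
    """Dot product on R^d."""
    return sum(a * b for a, b in zip(x, y))

# An orthonormal family in R^4 (rows of a normalised 4x4 Hadamard matrix).
H = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]

def combine(coeffs, family):
    """Return the vector sum_n c_n h_n."""
    d = len(family[0])
    return [sum(c * h[k] for c, h in zip(coeffs, family)) for k in range(d)]

c = [3.0, -1.0, 0.5, 2.0]
x = combine(c, H)
norm_sq = inner(x, x)                      # ||sum_n c_n h_n||^2
coeff_sq_sum = sum(cn * cn for cn in c)    # sum_n |c_n|^2
recovered = [inner(x, h) for h in H]       # the coefficients as (x|h_n)
```

Both `norm_sq` and `coeff_sq_sum` evaluate to the same number, although the vector `x` itself looks nothing like the coefficient vector `c`.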

Definition 3.19 (Orthonormal system, orthonormal basis). Let $I$ be a nonempty set. A
family $(h_i)_{i\in I}$ in $H$ is called an orthonormal system if

$$(h_i|h_j) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise,} \end{cases}$$

for all $i, j \in I$. In the case of a countable set $I$, an orthonormal system $(h_i)_{i\in I}$
is called an orthonormal basis if every $x \in H$ can be represented as a convergent series

$$x = \sum_{i\in I} c_i h_i \tag{3.7}$$

for suitable coefficients $c_i \in \mathbb{K}$.


The convergence of the sum (3.7) is understood in the following sense. We pick
an enumeration of the index set, say $I = (i_n)_{n\ge 1}$, and ask for convergence of the sum
$\sum_{n\ge 1} c_{i_n} h_{i_n}$. By Parseval's theorem, this sum converges if and only if $\sum_{n\ge 1} |c_{i_n}|^2 < \infty$.
Thus, whether or not the sum converges is independent of the enumeration chosen. To
see that the sum $x := \sum_{n\ge 1} c_{i_n} h_{i_n}$ is independent of the enumeration, let $(j_m)_{m\ge 1}$ be
another enumeration of $I$ and set $y := \sum_{m\ge 1} c_{j_m} h_{j_m}$. Then for any $i \in I$, say $i = i_n = j_m$,

$$(x|h_i) = (x|h_{i_n}) = c_{i_n} = c_i = c_{j_m} = (y|h_{j_m}) = (y|h_i).$$

Since both $x$ and $y$ belong to the closed linear span of the family $(h_i)_{i\in I}$, this implies
$x = y$. This argument also shows that the coefficients $c_i$ in (3.7) are uniquely determined
by $x$ and given by $c_i = (x|h_i)$.


Example 3.20. The standard unit vectors $(0,\dots,0,1,0,\dots)$, $n \ge 1$ (with the 1 in the
$n$th place), form an orthonormal basis for $\ell^2$.

In Section 3.5 we prove that the trigonometric system is an orthonormal basis for
$L^2(0,1)$ and the (suitably normalised) Hermite polynomials form an orthonormal basis
for $L^2(\mathbb{R},\gamma)$, where $\gamma$ is the standard Gaussian measure on $\mathbb{R}$.


Theorem 3.21 (Orthonormal bases). Let $(h_n)_{n\ge 1}$ be an orthonormal sequence in $H$.
The following assertions are equivalent:


(1) $(h_n)_{n\ge 1}$ is an orthonormal basis;
(2) $(h_n)_{n\ge 1}$ has dense linear span;
(3) if $x \in H$ satisfies $(x|h_n) = 0$ for all $n \ge 1$, then $x = 0$.


Proof The equivalence (2)$\Leftrightarrow$(3) is immediate from the fact that a subspace is dense if
and only if its orthogonal complement is trivial.


(1)$\Rightarrow$(2): This implication is trivial, because by assumption every $x \in H$ can be
approximated by the partial sums of a series representation of the form $\sum_{n\ge 1} c_n h_n$, and
these partial sums belong to the linear span of $(h_n)_{n\ge 1}$.

(2)$\Rightarrow$(1): Suppose the linear span of $(h_n)_{n\ge 1}$ is dense in $H$ and fix an arbitrary $x \in H$.
We must prove that $x$ admits a representation as a convergent sum $\sum_{n\ge 1} c_n h_n$.
For each $N \ge 1$ the mapping

$$P_N x := \sum_{n=1}^N (x|h_n) h_n$$

is a projection that maps $H$ onto the span $H_N$ of $(h_n)_{n=1}^N$. If $x \perp H_N$, then $(x|h_n) = 0$ for
$n = 1,\dots,N$ and therefore $P_N x = 0$. It follows that the projection $P_N$ is orthogonal. This
implies that $P_N$ is contractive. Therefore, with $c_n := (x|h_n)$,

$$\sum_{n=1}^N |c_n|^2 = \|P_N x\|^2 \le \|x\|^2.$$

This being true for all $N \ge 1$, it follows that $\sum_{n\ge 1} |c_n|^2 < \infty$ and therefore the sum $y :=
\sum_{n\ge 1} c_n h_n$ is convergent in $H$ by Proposition 3.18. For all $n \ge 1$ we have $(y|h_n) = c_n =
(x|h_n)$, and since the span of $(h_n)_{n\ge 1}$ is dense in $H$ this implies $x = y = \sum_{n\ge 1} c_n h_n$.


When $(x_n)_{n\ge 1}$ is a (finite or infinite) linearly independent sequence in $H$, we may
construct an orthonormal sequence $(h_n)_{n\ge 1}$ with the property that

$$\operatorname{span}\{x_1,\dots,x_k\} = \operatorname{span}\{h_1,\dots,h_k\}, \qquad k \ge 1,$$

as follows. Set $h_1 := x_1/\|x_1\|$. Suppose the orthonormal vectors $h_1,\dots,h_k$ have been
chosen subject to the condition that $H_j := \operatorname{span}\{x_1,\dots,x_j\}$ equals $\operatorname{span}\{h_1,\dots,h_j\}$ for
all $j = 1,\dots,k$. By linear independence, the subspace $H_{k+1} := \operatorname{span}\{x_1,\dots,x_{k+1}\}$ has
dimension $k+1$. The orthogonal complement in $H_{k+1}$ of the $k$-dimensional subspace
$H_k$ has dimension 1, and therefore we may select a norm one vector $h_{k+1} \in H_{k+1}$
orthogonal to $H_k$. Then $H_{k+1} = \operatorname{span}\{h_1,\dots,h_{k+1}\}$ as desired. This procedure is called
Gram–Schmidt orthogonalisation.
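
The construction above only asserts that a suitable unit vector $h_{k+1}$ exists. One standard way to produce it is to subtract from $x_{k+1}$ its components along $h_1,\dots,h_k$ and normalise the remainder. The following Python sketch does this for real vectors in $\mathbb{R}^d$; the function name `gram_schmidt`, the tolerance `tol`, and the restriction to real scalars are assumptions made for illustration.

```python
import math

def inner(u, v):
    """Euclidean inner product on R^d (real scalars only)."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(xs, tol=1e-12):
    """Return orthonormal h_1, ..., h_k with span{h_1..h_j} = span{x_1..x_j} for all j.

    Each new vector is obtained by removing from x the components along the
    previously constructed h's and normalising what is left.
    """
    hs = []
    for x in xs:
        r = list(x)
        for h in hs:
            c = inner(r, h)  # coefficient of the current remainder along h
            r = [ri - c * hi for ri, hi in zip(r, h)]
        length = math.sqrt(inner(r, r))
        if length < tol:
            raise ValueError("input vectors are not linearly independent")
        hs.append([ri / length for ri in r])
    return hs

# Illustrative input: three linearly independent vectors in R^3.
xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
hs = gram_schmidt(xs)
for h in hs:
    print([round(c, 4) for c in h])
```

Since $x_j \in \operatorname{span}\{h_1,\dots,h_j\}$, each $x_j$ is orthogonal to $h_k$ for $k > j$; this is an easy way to check the span condition numerically.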


Theorem 3.22 (Orthonormal bases and separability). A Hilbert space has an orthonor- 
mal basis if and only if it is separable. 


Proof 'If': By assumption we can find a (finite or infinite) sequence $(x_n)_{n\ge 1}$ with
dense span in $H$. By passing to a subsequence, we may assume that the elements of the
sequence are linearly independent. By Gram–Schmidt orthogonalisation we construct
an orthonormal sequence $(h_n)_{n\ge 1}$ with the property that for all $k \ge 1$ the linear span of
$\{h_1,\dots,h_k\}$ equals the linear span of $\{x_1,\dots,x_k\}$. Since the linear span of $(x_n)_{n\ge 1}$ is
dense in $H$, the sequence $(h_n)_{n\ge 1}$ is an orthonormal basis of $H$ by Theorem 3.21.


'Only if': If $(h_n)_{n\ge 1}$ is an orthonormal basis of $H$, its linear span is dense. The finite
linear combinations of the $h_n$ with rational coefficients (with real and imaginary parts
rational in the complex case) then form a countable dense set, so $H$ is separable.


Corollary 3.23. Any two infinite-dimensional separable Hilbert spaces are isometri- 
cally isomorphic. 

Proof Suppose that the Hilbert spaces $H_1$ and $H_2$ are separable and pick orthonormal
bases $(h_n^{(1)})_{n\ge 1}$ and $(h_n^{(2)})_{n\ge 1}$. The operator $U$ sending $h_n^{(1)}$ to $h_n^{(2)}$ for each $n \ge 1$ is
isometric by the Parseval identity and has dense range. In particular $U$ is injective. By
Proposition 1.21, $U$ has also closed range and therefore $U$ is surjective.


Definition 3.24 (Maximal orthonormal systems). A maximal orthonormal system is a
family $(h_i)_{i\in I}$, where $I$ is a nonempty set and:


(i) $(h_i|h_j) = \delta_{ij}$ for all $i, j \in I$;
(ii) if $h \perp h_i$ for all $i \in I$, then $h = 0$.


In a separable Hilbert space, every maximal orthonormal system is countable and can 
therefore be relabelled into an orthonormal basis. 


Theorem 3.25 (Maximal orthonormal systems). Every nonzero Hilbert space has a 
maximal orthonormal system. 


Proof Partially order the set of all orthonormal systems in the nonzero Hilbert space
$H$ by set inclusion. By Zorn's lemma (Theorem A.3) this set has a maximal element,
say $(h_i)_{i\in I}$, where $I$ is some index set. It is clear that condition (i) in the above definition
holds. If there were a nonzero $h \in H$ perpendicular to each $h_i$, then after normalising $h$ to unit
length we would obtain a new orthonormal system properly containing $(h_i)_{i\in I}$, contradicting
the maximality of $(h_i)_{i\in I}$. Therefore (ii) also holds.


3.5 Examples 


In this final section we present two nontrivial examples of orthonormal bases. 


3.5.a The Trigonometric System 


In this example $\mathbb{T}$ denotes the unit circle in the complex plane, parametrised by the
interval $[-\pi,\pi]$ and equipped with the normalised Lebesgue measure $d\theta/2\pi$. We shall
prove that the functions

$$e_n(\theta) := \exp(in\theta), \qquad \theta \in [-\pi,\pi],\ n \in \mathbb{Z},$$

form an orthonormal basis for $L^2(\mathbb{T})$.
That $(e_n)_{n\in\mathbb{Z}}$ is an orthonormal sequence in $L^2(\mathbb{T})$ is evident from

$$(e_j|e_k) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \exp(ij\theta)\,\overline{\exp(ik\theta)}\,d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi} \exp(i(j-k)\theta)\,d\theta = \delta_{jk}.$$

To prove that $(e_n)_{n\in\mathbb{Z}}$ is an orthonormal basis, by Theorem 3.21 it remains to be proved
that the trigonometric polynomials, i.e., the functions of the form $\sum_{n=-N}^{N} c_n e_n$, are dense
in $L^2(\mathbb{T})$. This can be deduced from the Stone–Weierstrass theorem (see Problem 3.11),
but we prefer the following argument from Fourier Analysis which gives explicit
approximants and some error bounds.


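Definition 3.26 (Fourier coefficients). The Fourier coefficients of a function $f \in L^1(\mathbb{T})$
are defined as

$$\hat f(n) := (f|e_n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)\exp(-in\theta)\,d\theta, \qquad n \in \mathbb{Z}.$$


Theorem 3.27. For all $f \in C(\mathbb{T})$ we have

$$\lim_{N\to\infty} \Big\| f - \frac{1}{N}\sum_{n=0}^{N-1}\sum_{k=-n}^{n} \hat f(k) e_k \Big\|_\infty = 0.$$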


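Proof Fix $f \in C(\mathbb{T})$ with $\|f\|_\infty = 1$. We have

$$\hat f(n)\exp(in\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \exp(in(\theta-\varphi)) f(\varphi)\,d\varphi = \frac{1}{2\pi}\int_{-\pi}^{\pi} \exp(in\varphi) f(\theta-\varphi)\,d\varphi$$

and therefore

$$f_N(\theta) := \frac{1}{N}\sum_{n=0}^{N-1}\sum_{k=-n}^{n} \hat f(k)\exp(ik\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} K_N(\varphi) f(\theta-\varphi)\,d\varphi,$$

where the Fejér kernel $K_N$ is defined by

$$K_N(\varphi) := \frac{1}{N}\sum_{n=0}^{N-1}\sum_{k=-n}^{n} \exp(ik\varphi) = \frac{1}{N}\,\frac{\sin^2(\frac12 N\varphi)}{\sin^2(\frac12 \varphi)}; \tag{3.8}$$

the right-hand side identity is readily deduced from the geometric series. In view of the
fact that $\frac{1}{2\pi}\int_{-\pi}^{\pi} K_N(\varphi)\,d\varphi = 1$, we have

$$|f_N(\theta) - f(\theta)| = \Big| \frac{1}{2\pi}\int_{-\pi}^{\pi} K_N(\varphi)\big(f(\theta-\varphi) - f(\theta)\big)\,d\varphi \Big| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} K_N(\varphi)\,|f(\theta-\varphi) - f(\theta)|\,d\varphi. \tag{3.9}$$

Fix $\varepsilon > 0$ and choose $0 < \delta < \pi$ so small that $\|f(\cdot - \varphi) - f\|_\infty < \varepsilon$ for all $|\varphi| < \delta$; this
is possible since $f$ is uniformly continuous. Then

$$\frac{1}{2\pi}\int_{-\delta}^{\delta} K_N(\varphi)\,|f(\theta-\varphi) - f(\theta)|\,d\varphi \le \frac{\varepsilon}{2\pi}\int_{-\pi}^{\pi} K_N(\varphi)\,d\varphi = \varepsilon \tag{3.10}$$

and, by (3.8) and the normalisation $\|f\|_\infty = 1$,

$$\frac{1}{2\pi}\int_{\{\delta \le |\varphi| \le \pi\}} K_N(\varphi)\,|f(\theta-\varphi) - f(\theta)|\,d\varphi \le \frac{2}{N}\,\frac{1}{\sin^2(\frac12\delta)}. \tag{3.11}$$

Combining (3.9) with (3.10) and (3.11) we obtain

$$\|f_N - f\|_\infty \le \varepsilon + \frac{2}{N\sin^2(\frac12\delta)},$$

so $\limsup_{N\to\infty} \|f_N - f\|_\infty \le \varepsilon$. Since $\varepsilon > 0$ was arbitrary, this completes the proof.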
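
The two facts about the Fejér kernel used in the proof, the closed form in (3.8) and the normalisation $\frac{1}{2\pi}\int_{-\pi}^{\pi} K_N = 1$, can be checked numerically. The Python sketch below is only an illustration and no substitute for a proof; the function names and the choice of a midpoint rule with `M = 64` nodes are assumptions.

```python
import cmath
import math

def fejer_sum(N, phi):
    """K_N(phi) from the double sum (1/N) sum_{n<N} sum_{|k|<=n} exp(i k phi)."""
    total = 0.0
    for n in range(N):
        for k in range(-n, n + 1):
            total += cmath.exp(1j * k * phi).real  # imaginary parts cancel in pairs
    return total / N

def fejer_closed(N, phi):
    """Closed form sin^2(N phi / 2) / (N sin^2(phi / 2)), valid for phi not in 2 pi Z."""
    return math.sin(0.5 * N * phi) ** 2 / (N * math.sin(0.5 * phi) ** 2)

def fejer_mean(N, M=64):
    """Midpoint-rule approximation of (1/2pi) * integral of K_N over [-pi, pi].

    With M even, no node equals 0, so the closed form can be used safely.
    """
    nodes = (-math.pi + (j + 0.5) * 2.0 * math.pi / M for j in range(M))
    return sum(fejer_closed(N, phi) for phi in nodes) / M

for N in (1, 2, 5, 8):
    print(N, fejer_sum(N, 1.0), fejer_closed(N, 1.0), fejer_mean(N))
```

The closed form also makes the key qualitative features visible: $K_N \ge 0$, and away from $\varphi = 0$ the kernel is bounded by $1/(N\sin^2(\frac12\delta))$, which is exactly what (3.11) uses.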


This theorem implies that the trigonometric polynomials are dense in $C(\mathbb{T})$. Since
this space is dense in $L^2(\mathbb{T})$ by Proposition 2.29, it follows that the trigonometric
polynomials are dense in $L^2(\mathbb{T})$. Therefore, Theorem 3.21 implies:


Theorem 3.28. The trigonometric system $(e_n)_{n\in\mathbb{Z}}$ forms an orthonormal basis in $L^2(\mathbb{T})$.


The theory of orthonormal bases can now be applied. It entails that every function $f \in
L^2(\mathbb{T})$ has a unique series representation of the form $f = \sum_{n\in\mathbb{Z}} c_n e_n$, with convergence
in $L^2(\mathbb{T})$ and coefficients given by $c_n = (f|e_n) = \hat f(n)$. The resulting expansion

$$f = \sum_{n\in\mathbb{Z}} \hat f(n) e_n$$

is called the Fourier series of $f$. By Parseval's identity we have

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(\theta)|^2\,d\theta = \sum_{n\in\mathbb{Z}} |\hat f(n)|^2$$

and the mapping $f \mapsto (\hat f(n))_{n\in\mathbb{Z}}$ is an isometry from $L^2(\mathbb{T})$ onto $\ell^2(\mathbb{Z})$.
By translation and scaling, the functions

$$e_n(\theta) := \exp(2\pi i n\theta), \qquad n \in \mathbb{Z},$$

form an orthonormal basis for $L^2(0,1)$. This can be used to prove:


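Corollary 3.29 (Euler). $\displaystyle\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$.


Proof For the function $f(\theta) = \theta$ in $L^2(0,1)$ we have $(f|e_0) = \frac12$ and $(f|e_n) = -\frac{1}{2\pi i n}$
for $n \ne 0$, so by Parseval's identity

$$\frac13 = \int_0^1 \theta^2\,d\theta = \|f\|^2 = \sum_{n\in\mathbb{Z}} |(f|e_n)|^2 = \frac14 + 2\sum_{n=1}^{\infty} \frac{1}{4\pi^2 n^2} = \frac14 + \frac{1}{2\pi^2}\sum_{n=1}^{\infty}\frac{1}{n^2},$$

and the result follows.
Another proof will be given in Section 14.5.f.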
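
The Parseval computation in the proof of Corollary 3.29 can be followed numerically. The sketch below (function names are illustrative) evaluates the truncated right-hand side $\frac14 + 2\sum_{n=1}^{N} \frac{1}{4\pi^2 n^2}$ and the partial sums of $\sum 1/n^2$; both converge slowly, with an error of order $1/N$.

```python
import math

def parseval_partial(N):
    """Truncated Parseval sum for f(theta) = theta on (0, 1):
    |(f|e_0)|^2 + 2 * sum_{n=1}^{N} |(f|e_n)|^2 with |(f|e_n)|^2 = 1/(4 pi^2 n^2)."""
    return 0.25 + sum(2.0 / (4.0 * math.pi ** 2 * n ** 2) for n in range(1, N + 1))

def basel_partial(N):
    """Partial sum of sum_{n >= 1} 1/n^2."""
    return sum(1.0 / n ** 2 for n in range(1, N + 1))

for N in (10, 1000, 100000):
    print(N, parseval_partial(N), basel_partial(N))
print("targets:", 1.0 / 3.0, math.pi ** 2 / 6.0)
```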


Figure 3.1 The function $f$, extended from $(0,1)$ to $(-2,2)$ by odd reflections (thick graph),
and the functions $\cos(\pi n\theta/2)$ for $n = 1,2$. Notice that $\int_{-2}^{2} f(\theta)\cos(\pi n\theta/2)\,d\theta = 0$ for all
$n \ge 1$, because $f(\theta)$ is odd about 0 and $\cos(\pi n\theta/2)$ is even about 0.


Figure 3.2 Idem, but now with the functions $\sin(\pi n\theta/2)$ for $n = 1,2$. This time we have
$\int_{-2}^{2} f(\theta)\sin(\pi n\theta/2)\,d\theta = 0$ if $n \ge 1$ is odd, because $f$ is odd about $\pm 1$ and $\sin(\pi n\theta/2)$ is
even about $\pm 1$ for odd $n$.


The system of functions

$$1,\ \sqrt{2}\sin(2\pi n\theta),\ \sqrt{2}\cos(2\pi n\theta): \qquad n = 1,2,\dots \tag{3.12}$$

is orthonormal in $L^2(0,1)$ and the trigonometric functions $\exp(2\pi i n\theta)$, $n \in \mathbb{Z}$, are
contained in its linear span. Hence this system forms an orthonormal basis by Theorem
3.21. Interestingly, from this we can deduce:


Theorem 3.30. Each one of the two systems

$$\sqrt{2}\sin(\pi n\theta): \qquad n = 1,2,\dots \tag{3.13}$$

and

$$1,\ \sqrt{2}\cos(\pi n\theta): \qquad n = 1,2,\dots \tag{3.14}$$

forms an orthonormal basis for $L^2(0,1)$.

These bases arise naturally as the eigenvector bases of the Dirichlet and Neumann
Laplacians in $L^2(0,1)$, respectively (see Example 12.23).


Proof Given a function $f \in L^2(0,1)$, we extend it to an odd function in $L^2(-1,1)$. This
function is extended to a function in $L^2(-2,2)$ whose restriction to $(0,2)$ is odd about
the point 1 and whose restriction to $(-2,0)$ is odd about the point $-1$. If we expand the
resulting function against the orthonormal basis for $L^2(-2,2)$ obtained by scaling the
system (3.12), that is,

$$\frac12,\ \frac{1}{\sqrt2}\sin(\pi n\theta/2),\ \frac{1}{\sqrt2}\cos(\pi n\theta/2): \qquad n = 1,2,\dots,$$

then, due to the symmetries introduced by the odd reflections, only the coefficients
corresponding to the system (3.13) with even indices can contribute, but not the ones
with odd indices; nor do those of (3.14) contribute; see Figures 3.1 and 3.2. If we do
the same with even extensions, only the coefficients corresponding to the system (3.14)
with even indices can contribute. Restricting the resulting expansions in $L^2(-2,2)$ to
$L^2(0,1)$, the desired expansions of $f \in L^2(0,1)$ in terms of the systems (3.13) and (3.14)
are obtained.


3.5.b The Hermite Polynomials 


In this section we prove that the Hermite polynomials form an orthonormal basis for
$L^2(\mathbb{R},\gamma)$, where $\gamma$ is the standard Gaussian measure on $\mathbb{R}$. This is the Borel probability
measure on $\mathbb{R}$ which is given, for Borel sets $B \subseteq \mathbb{R}$, by

$$\gamma(B) := \frac{1}{\sqrt{2\pi}}\int_B \exp(-\tfrac12 x^2)\,dx.$$

The Hermite polynomials will resurface in Chapters 9, 13, and 15 in connection with
the spectral theorem, the Ornstein–Uhlenbeck semigroup, and second quantisation,
respectively.

Definition 3.31. For $n \in \mathbb{N}$ the Hermite polynomials $H_n : \mathbb{R} \to \mathbb{R}$ are defined by the
identity

$$H(t,x) := \exp(tx - \tfrac12 t^2) = \sum_{n=0}^{\infty} \frac{t^n}{n!} H_n(x), \qquad t, x \in \mathbb{R}. \tag{3.15}$$


The first five Hermite polynomials are given by

$$H_0(x) = 1, \quad H_1(x) = x, \quad H_2(x) = x^2 - 1, \quad H_3(x) = x^3 - 3x, \quad H_4(x) = x^4 - 6x^2 + 3.$$

By induction one shows that

$$H_n(x) = \sum_{k=0}^{\lfloor n/2\rfloor} \frac{(-1)^k\, n!}{2^k\, k!\,(n-2k)!}\, x^{n-2k}, \qquad n \in \mathbb{N}.$$

Proposition 3.32. The Hermite polynomials have the following properties:


(i) $H_n(-x) = (-1)^n H_n(x)$;
(ii) $H_{n+2}(x) = x H_{n+1}(x) - (n+1) H_n(x)$;
(iii) $H_{n+1}'(x) = (n+1) H_n(x)$;
(iv) $H_n$ is a monic polynomial of degree $n$.


Proof Property (i) follows from the identity $H(t,-x) = H(-t,x)$, (ii) from the identity
$\frac{\partial}{\partial t} H(t,x) = (x-t) H(t,x)$, and (iii) from $\frac{\partial}{\partial x} H(t,x) = t H(t,x)$. Assertion (iv) follows from
(ii) and the facts that $H_0(x) = 1$ and $H_1(x) = x$.
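
The recurrence in Proposition 3.32(ii), started from $H_0(x) = 1$ and $H_1(x) = x$, gives a simple way to compute the Hermite polynomials. The Python sketch below stores a polynomial as a list of integer coefficients, lowest degree first; the function names are illustrative.

```python
def hermite_coeffs(n):
    """Coefficients [a_0, ..., a_n] of H_n, computed from
    H_0 = 1, H_1 = x and H_{k+2} = x * H_{k+1} - (k + 1) * H_k."""
    h_prev, h_curr = [1], [0, 1]
    if n == 0:
        return h_prev
    for k in range(n - 1):
        shifted = [0] + h_curr                        # x * H_{k+1}
        scaled = [(k + 1) * a for a in h_prev]        # (k + 1) * H_k
        scaled += [0] * (len(shifted) - len(scaled))  # pad to equal length
        h_prev, h_curr = h_curr, [s - t for s, t in zip(shifted, scaled)]
    return h_curr

def derivative(coeffs):
    """Coefficients of the derivative of a polynomial."""
    return [i * a for i, a in enumerate(coeffs)][1:]

for n in range(5):
    print(n, hermite_coeffs(n))
```

For $n = 4$ this reproduces $x^4 - 6x^2 + 3$, and comparing `derivative(hermite_coeffs(n + 1))` with $(n+1)$ times `hermite_coeffs(n)` illustrates property (iii).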


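Theorem 3.33. The sequence $(\frac{1}{\sqrt{n!}} H_n)_{n\in\mathbb{N}}$ forms an orthonormal basis for $L^2(\mathbb{R},\gamma)$.

Proof For all $s,t \in \mathbb{R}$ we have

$$\begin{aligned}
\int_{-\infty}^{\infty} H(s,x) H(t,x)\,d\gamma(x) &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \exp\big((s+t)x - \tfrac12(s^2+t^2) - \tfrac12 x^2\big)\,dx \\
&= \exp(st)\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \exp\big(-\tfrac12 (x-(s+t))^2\big)\,dx \\
&= \exp(st)\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \exp(-\tfrac12 y^2)\,dy = \exp(st).
\end{aligned} \tag{3.16}$$

Taking derivatives $\frac{\partial^{m+n}}{\partial s^m\,\partial t^n}$ at $s = t = 0$ on both sides of the identity in (3.16) gives

$$\int_{-\infty}^{\infty} H_m(x) H_n(x)\,d\gamma(x) = m!\,\delta_{mn}.$$

Since $m!\,\delta_{mn} = \sqrt{m!}\sqrt{n!}\,\delta_{mn}$, this shows that the sequence $(\frac{1}{\sqrt{n!}} H_n)_{n\in\mathbb{N}}$ is orthonormal
in $L^2(\mathbb{R},\gamma)$.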
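
The orthogonality relation $\int H_m H_n\,d\gamma = m!\,\delta_{mn}$ can be verified exactly for small indices, using the well-known moments of the standard Gaussian measure: $\int x^k\,d\gamma(x)$ equals $0$ for odd $k$ and $(k-1)!! = 1\cdot 3\cdots(k-1)$ for even $k$. The sketch below works in exact integer arithmetic; all function names are illustrative, and the polynomials are generated from the recurrence of Proposition 3.32(ii).

```python
from math import factorial

def hermite_coeffs(n):
    """Coefficients of H_n (lowest degree first) via H_{k+2} = x H_{k+1} - (k+1) H_k."""
    h_prev, h_curr = [1], [0, 1]
    if n == 0:
        return h_prev
    for k in range(n - 1):
        shifted = [0] + h_curr
        scaled = [(k + 1) * a for a in h_prev]
        scaled += [0] * (len(shifted) - len(scaled))
        h_prev, h_curr = h_curr, [s - t for s, t in zip(shifted, scaled)]
    return h_curr

def gaussian_moment(k):
    """Integral of x^k against the standard Gaussian measure: 0 if k is odd, (k-1)!! if k is even."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(1, k, 2):
        result *= j
    return result

def gaussian_inner(p, q):
    """Integral of p(x) q(x) d gamma(x) for real polynomials given as coefficient lists."""
    return sum(a * b * gaussian_moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))

for m in range(4):
    print([gaussian_inner(hermite_coeffs(m), hermite_coeffs(n)) for n in range(4)])
```

The printed matrix is diagonal with entries $0!, 1!, 2!, 3!$, in line with the normalisation factor $1/\sqrt{n!}$ in Theorem 3.33.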

It remains to show that the span of the Hermite polynomials is dense in $L^2(\mathbb{R},\gamma)$. A
quick proof is obtained by making use of the injectivity of the Fourier transform (which
is an immediate consequence of Theorem 5.21). If $f \in L^2(\mathbb{R},\gamma)$ is orthogonal to every
Hermite polynomial, then it is orthogonal to every polynomial. From this it follows that
for all $z \in \mathbb{C}$,

$$F(z) := \int_{-\infty}^{\infty} f(x)\, e^{zx}\, e^{-\frac12 x^2}\,dx = \sum_{n=0}^{\infty} \frac{z^n}{n!}\int_{-\infty}^{\infty} f(x)\, x^n\, e^{-\frac12 x^2}\,dx = 0.$$

In particular we have $F(it) = 0$ for all $t \in \mathbb{R}$. From this we infer that the Fourier transform of $x \mapsto
f(x)e^{-\frac12 x^2}$ vanishes identically. By the injectivity of the Fourier transform, this implies
that $f(x)e^{-\frac12 x^2} = 0$ for almost all $x \in \mathbb{R}$, and therefore $f(x) = 0$ for almost all $x \in \mathbb{R}$.


The identity $H_{n+2}(x) = x H_{n+1}(x) - (n+1) H_n(x)$ of Proposition 3.32 is an example
of a so-called three point recurrence relation. As we will see in Section 9.6, orthogonal
polynomials in $L^2(\mathbb{R},\mu)$, with $\mu$ a finite measure on $\mathbb{R}$, always satisfy a three point
recurrence relation, and conversely if a sequence of polynomials on $\mathbb{R}$ satisfies such a
relation, then under a mild additional assumption these polynomials are orthogonal in
$L^2(\mathbb{R},\mu)$ for a suitable finite Borel measure $\mu$ on $\mathbb{R}$.

Theorem 3.33 admits an extension to higher dimensions; see Section 15.6. 


3.5.c Tensor Bases 


Let $\mu_j$ be a finite Borel measure on a compact metric space $K_j$ for each $j = 1,\dots,k$,
and let $K = K_1 \times \cdots \times K_k$ and $\mu = \mu_1 \times \cdots \times \mu_k$ be their products. If $(f_n^{(j)})_{n\ge 1}$ is an
orthonormal basis for $L^2(K_j,\mu_j)$ for each $j = 1,\dots,k$, then the functions

$$f_n(x) := f_{n_1}^{(1)}(x_1) \cdots f_{n_k}^{(k)}(x_k), \qquad n = (n_1,\dots,n_k) \in \{1,2,\dots\}^k,$$

form an orthonormal basis for $L^2(K,\mu)$. Orthonormality being clear, in view of
Theorem 3.21 it remains to check that the span of the functions $f_n$ is dense. This follows
from the fact that $C(K)$ is dense in $L^2(K,\mu)$ by the observation in Remark 2.31 and
the fact that the functions of the form $g(x) := g^{(1)}(x_1) \cdots g^{(k)}(x_k)$ with $g^{(j)} \in C(K_j)$ for all
$j = 1,\dots,k$ have dense linear span in $C(K)$ by Example 2.10. Since each of the functions $g^{(j)}$ can
be approximated in $L^2(K_j,\mu_j)$ by linear combinations of the functions $f_n^{(j)}$, $g$ can be
approximated in $L^2(K,\mu)$ by linear combinations of the functions $f_n$.

This is a special case of a more general construction involving tensor products of
Hilbert spaces (see Chapters 14 and 15, in particular (14.3)): if $(h_n^{(j)})_{n\ge 1}$ is an orthonormal
basis for the Hilbert space $H_j$, $j = 1,\dots,k$, then the vectors

$$h_n := h_{n_1}^{(1)} \otimes \cdots \otimes h_{n_k}^{(k)}, \qquad n \in \{1,2,\dots\}^k,$$

form an orthonormal basis for the Hilbert space tensor product $H = H_1 \otimes \cdots \otimes H_k$.


Problems 


3.1 Show that equality |(x|y)| = ||x||||y|| for the Cauchy—Schwarz inequality holds if 
and only if x and y are collinear (that is, both belong to some one-dimensional 
subspace). 

3.2 Let $(x_n)_{n\ge 1}$ be a sequence in a Hilbert space $H$. Suppose that there exists $x \in H$
such that:
(i) $(x_n|y) \to (x|y)$ for all $y \in H$;
(ii) $\|x_n\| \to \|x\|$.
Show that $x_n \to x$ in $H$.

3.3 Provide the missing details in the proof of Proposition 3.9. 

3.4 A Banach space $X$ is called strictly convex if for all norm one vectors $x_0, x_1 \in X$
with $x_0 \ne x_1$ and $0 < \lambda < 1$ we have $\|(1-\lambda)x_0 + \lambda x_1\| < 1$. Prove that every Hilbert
space is strictly convex.

3.5 Give an example of a nonempty compact convex set $C$ in a two-dimensional
Banach space $X$ along with a vector $x \in X$ such that the set

$$\Big\{ c \in C :\ \|x - c\| = \min_{y\in C} \|x - y\| \Big\}$$

consists of more than one element.

3.6 Let

$$C = \Big\{ f \in C[0,1] :\ f \text{ is real-valued},\ f(0) = 0,\ \int_0^1 f(t)\,dt = 0 \Big\}.$$

(a) Check that $C$ is a closed and convex subset of $C[0,1]$.

Let $g \in C[0,1]$ be the function defined by $g(t) := t$.

(b) Show that for any $f \in C$ we have $\|f - g\| > \frac12$.
(c) Show that $\inf_{f\in C} \|f - g\| = \frac12$.

This shows that $C$ contains no point minimising the distance $d(g,C)$.
3.7 In this problem we determine some orthogonal complements.

(a) Let

$$Y := \{ f \in L^2(0,1) :\ f(t) = 0 \text{ for almost all } t \in (0,\tfrac12) \}.$$

Show that $Y$ is a closed subspace of $L^2(0,1)$ and find $Y^\perp$.

(b) Let

$$Y := \Big\{ f \in L^2(0,1) :\ \int_0^1 f(t)\,dt = 0 \Big\}.$$

Show that $Y$ is a closed subspace of $L^2(0,1)$ and find $Y^\perp$.


3.8 Let $(h_n)_{n\ge 1}$ be a finite or infinite orthonormal sequence in a Hilbert space $H$.
Prove that for all $x \in H$ we have Bessel's inequality

$$\sum_{n\ge 1} |(x|h_n)|^2 \le \|x\|^2.$$

3.9 Let $(e_j)_{j\ge 1}$ be the sequence of standard unit vectors in $\ell^2$, and let $F = \overline{\operatorname{span}}\{e_{2n-1} :
n \ge 1\}$ and let $G = \overline{\operatorname{span}}\{e_{2n-1} + \frac1n e_{2n} :\ n \ge 1\}$.

(a) Give explicit expressions for the orthogonal projections $P_F$ and $P_G$.
(b) Show that $F \cap G = \{0\}$.
(c) Show that $F + G$ is dense, but not closed in $\ell^2$.

3.10 Prove the identity (3.8).

3.11 Use the Stone–Weierstrass theorem to prove that the trigonometric polynomials
are dense in $L^2(\mathbb{T})$.

3.12 Prove the following binomial identity for the Hermite polynomials: For all $n \in \mathbb{N}$
and $x, y \in \mathbb{R}$,

$$H_n(x+y) = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} H_k(y).$$

3.13 Prove the following formula for the Hermite polynomials: For all $n \in \mathbb{N}$ and $x \in \mathbb{R}$,

$$H_n(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} (x+iy)^n \exp(-\tfrac12 y^2)\,dy.$$

3.14 Define the polynomials $L_n$, $n \in \mathbb{N}$, by the generating function expansion

$$\frac{\exp(tx/(1+t))}{1+t} = \sum_{n=0}^{\infty} \frac{t^n}{n!} L_n(x).$$

These are the Laguerre polynomials normalised so as to become monic.

(a) Compute the polynomials $L_n$ for $n = 0,1,2,3$.
(b) Show that the polynomials $L_n$ are monic, have degree $n$, and satisfy the
recurrence relation
$$L_{n+2}(x) = (x - 2n - 3) L_{n+1}(x) - (n+1)^2 L_n(x), \qquad n \in \mathbb{N}.$$
(c) Prove that the sequence $(\frac{1}{n!} L_n)_{n\ge 0}$ is an orthonormal basis for $L^2(\mathbb{R}_+, e^{-x}\,dx)$.
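3.15 The Hardy space $H^2(\mathbb{D})$ is the vector space of all holomorphic functions on $\mathbb{D}$ of
the form $\sum_{n\in\mathbb{N}} c_n z^n$ with $\sum_{n\in\mathbb{N}} |c_n|^2 < \infty$.

(a) Prove that $H^2(\mathbb{D})$ is a Hilbert space with respect to the norm

$$\|f\|_{H^2(\mathbb{D})} := \Big( \sum_{n\in\mathbb{N}} |c_n|^2 \Big)^{1/2}.$$

Let the functions $e_n \in L^2(\mathbb{T})$ be defined by $e_n(\theta) := \exp(in\theta)$, $n \in \mathbb{N}$, $\theta \in [-\pi,\pi]$.

(b) Show that for all $f = \sum_{n\in\mathbb{N}} c_n z^n \in H^2(\mathbb{D})$ the sum $f|_{\mathbb{T}} := \sum_{n\in\mathbb{N}} c_n e_n$ converges
in $L^2(\mathbb{T})$ and that the mapping

$$f \mapsto f|_{\mathbb{T}}$$

sets up an isometric isomorphism from $H^2(\mathbb{D})$ onto the closed subspace of
$L^2(\mathbb{T})$ of all functions whose negative Fourier coefficients vanish.

(c) For holomorphic functions $f : \mathbb{D} \to \mathbb{C}$ with power series expansion $f(z) =
\sum_{n\in\mathbb{N}} c_n z^n$ and $0 < r < 1$ define

$$f_r := \sum_{n\in\mathbb{N}} c_n r^n e_n.$$

Show that for $f \in H^2(\mathbb{D})$ and $0 < r < 1$ we have $f_r \in L^2(\mathbb{T})$ and $\|f_r\|_{L^2(\mathbb{T})} \le
\|f|_{\mathbb{T}}\|_{L^2(\mathbb{T})}$, and show that

$$\lim_{r\uparrow 1} f_r = f|_{\mathbb{T}} \quad \text{in } L^2(\mathbb{T}).$$

(d) Show that a holomorphic function $f : \mathbb{D} \to \mathbb{C}$ belongs to $H^2(\mathbb{D})$ if and only
if

$$\sup_{0<r<1} \|f_r\|_{L^2(\mathbb{T})} < \infty$$

and that in this case we have

$$\|f\|_{H^2(\mathbb{D})} = \|f|_{\mathbb{T}}\|_{L^2(\mathbb{T})} = \sup_{0<r<1} \|f_r\|_{L^2(\mathbb{T})}.$$

(e) Show that for all $f \in H^2(\mathbb{D})$, $0 < r < 1$, and $\theta \in [-\pi,\pi]$ we have

$$f(r\exp(i\theta)) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f|_{\mathbb{T}}(\eta)\, P_r(\theta - \eta)\,d\eta,$$

where the Poisson kernel is given by

$$P_r(\theta) := \frac{1 - r^2}{1 - 2r\cos(\theta) + r^2}.$$

Hint: Begin by showing that $P_r(\theta) = \sum_{n\in\mathbb{Z}} r^{|n|}\exp(in\theta)$.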
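3.16 We continue our study of the space $H^2(\mathbb{D})$.

(a) Show that for all $z_0 \in \mathbb{D}$ the function

$$k_{z_0}(z) := \frac{1}{1 - \overline{z_0}\,z}$$

belongs to $H^2(\mathbb{D})$ and

$$\|k_{z_0}\|_{H^2(\mathbb{D})} = \frac{1}{(1 - |z_0|^2)^{1/2}}.$$

(b) Show that for all $f \in H^2(\mathbb{D})$ and $z_0 \in \mathbb{D}$ we have

$$f(z_0) = (f|k_{z_0}).$$

(c) Use parts (a) and (b) to show that if $f_n \to f$ in $H^2(\mathbb{D})$, then $f_n \to f$ uniformly
on every compact subset of $\mathbb{D}$.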

3.17 The disc algebra $A(\mathbb{D})$ is the closed subspace of the Banach space $C(\overline{\mathbb{D}})$ consisting
of those functions that are holomorphic on $\mathbb{D}$ (see Problem 2.38).

(a) Show that $A(\mathbb{D})$ is dense in $H^2(\mathbb{D})$ (see Problem 3.15 for its definition).
(b) Show that if $f \in C(\overline{\mathbb{D}})$, then we have $f|_{\mathbb{D}} \in H^2(\mathbb{D})$ if and only if $f \in A(\mathbb{D})$.
(c) Show that the restriction mapping $\rho : A(\mathbb{D}) \to C(\mathbb{T})$ given by

$$f \mapsto f|_{\mathbb{T}}$$

extends to an isometry from $H^2(\mathbb{D})$ into $L^2(\mathbb{T})$. How does it relate to
Problem 3.15?


3.18 Let $A^2(\mathbb{D})$ denote the subspace of $L^2(\mathbb{D})$ consisting of all square integrable
holomorphic functions on $\mathbb{D}$. The goal of this problem is to show that $A^2(\mathbb{D})$ is a
Hilbert space with orthonormal basis $(e_n)_{n\in\mathbb{N}}$ given by

$$e_n(z) := \Big( \frac{n+1}{\pi} \Big)^{1/2} z^n, \qquad z \in \mathbb{D},\ n \in \mathbb{N}.$$

(a) Show that $(e_n)_{n\in\mathbb{N}}$ is an orthonormal system in $L^2(\mathbb{D})$.
Hint: Perform a computation in polar coordinates.

(b) Show that the closed linear span $Y$ of $(e_n)_{n\in\mathbb{N}}$ in $L^2(\mathbb{D})$ is contained in $A^2(\mathbb{D})$.
Hint: On the one hand, every $f \in Y$ can be written as a convergent sum
$f = \sum_{n\in\mathbb{N}} c_n e_n$ with convergence in $L^2(\mathbb{D})$ (explain why). On the other hand,
for each $z \in \mathbb{D}$ the sum $g(z) := \sum_{n\in\mathbb{N}} c_n \big(\frac{n+1}{\pi}\big)^{1/2} z^n$ converges absolutely. Now
use the fact that $L^2$-convergence implies pointwise almost everywhere
convergence of a subsequence to show that $f(z) = g(z)$ for almost all $z \in \mathbb{D}$.

(c) Show that if a holomorphic function $f \in L^2(\mathbb{D})$ satisfies $(f|e_n) = 0$ for all
$n \in \mathbb{N}$, then $f = 0$.
Hint: Consider the Taylor expansion of $f$ around 0.

(d) Combine the above to conclude that $A^2(\mathbb{D})$ is a Hilbert space with orthonormal
basis $(e_n)_{n\in\mathbb{N}}$.

(e) How are the spaces $A^2(\mathbb{D})$ and $H^2(\mathbb{D})$ related?

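3.19 On $\mathbb{C}$ we consider the measure

$$d\gamma_{\mathbb{C}}(z) = \frac{1}{\pi} e^{-|z|^2}\,dz,$$

where $dz$ is the Lebesgue measure on $\mathbb{C}$.

(a) Show that $\gamma_{\mathbb{C}}$ is a probability measure satisfying $\int_{\mathbb{C}} |z|^2\,d\gamma_{\mathbb{C}}(z) = 1$.

Let $A^2(\mathbb{C})$ denote the complex vector space of all entire functions $f : \mathbb{C} \to \mathbb{C}$ that
are square integrable with respect to $\gamma_{\mathbb{C}}$,

$$\int_{\mathbb{C}} |f(z)|^2\,d\gamma_{\mathbb{C}}(z) < \infty.$$

(b) Show that $A^2(\mathbb{C})$ is a Hilbert space with respect to the inner product

$$(f|g) = \int_{\mathbb{C}} f(z)\overline{g(z)}\,d\gamma_{\mathbb{C}}(z).$$

(c) Show that the functions

$$e_n(z) := \frac{z^n}{\sqrt{n!}}, \qquad n \in \mathbb{N},$$

form an orthonormal basis for $A^2(\mathbb{C})$.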
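3.20 In this problem we continue our study of the Hilbert space $A^2(\mathbb{C})$.

(a) Using the mean value theorem, show that for all $w \in \mathbb{C}$ the mapping $f \mapsto
f(w)$ is continuous from $A^2(\mathbb{C})$ to $\mathbb{C}$. Deduce that for all $w \in \mathbb{C}$ there exists
a unique function $k_w \in A^2(\mathbb{C})$ such that

$$f(w) = \int_{\mathbb{C}} f(z)\overline{k_w(z)}\,d\gamma_{\mathbb{C}}(z), \qquad f \in A^2(\mathbb{C}),\ w \in \mathbb{C}.$$

(b) Show that this function is given by

$$k_w(z) = \exp(z\overline{w}), \qquad w, z \in \mathbb{C}.$$

(c) Show that the orthogonal projection $P$ in $L^2(\mathbb{C},\gamma_{\mathbb{C}})$ onto $A^2(\mathbb{C})$ is given by

$$Pf(w) = \int_{\mathbb{C}} f(z)\overline{k_w(z)}\,d\gamma_{\mathbb{C}}(z), \qquad f \in L^2(\mathbb{C},\gamma_{\mathbb{C}}),\ w \in \mathbb{C}.$$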


3.21 This problem discusses the construction and elementary properties of conditional
expectations. Let $(\Omega,\mathscr{F},\mu)$ be a probability space and let $\mathscr{G}$ be a sub-$\sigma$-algebra
of $\mathscr{F}$. Let $1 \le p \le \infty$ and denote by $L^p(\Omega,\mathscr{G})$ the subspace of $L^p(\Omega)$ consisting of
all $f \in L^p(\Omega)$ having a $\mathscr{G}$-measurable pointwise defined representative.

(a) Show that $L^p(\Omega,\mathscr{G})$ is a closed subspace of $L^p(\Omega)$.

(b) Let $P_{\mathscr{G}}$ denote the orthogonal projection in $L^2(\Omega)$ onto $L^2(\Omega,\mathscr{G})$ and fix a
function $f \in L^2(\Omega)$. Prove that $P_{\mathscr{G}} f$ is the unique element of $L^2(\Omega,\mathscr{G})$ such
that the following identity holds for all $G \in \mathscr{G}$:

$$\int_G f\,d\mu = \int_G P_{\mathscr{G}} f\,d\mu.$$

Hint: $f - P_{\mathscr{G}} f \perp 1_G$ in $L^2(\Omega)$.

(c) Prove that if $f \in L^2(\Omega)$ satisfies $f \ge 0$ $\mu$-almost everywhere, then $P_{\mathscr{G}} f \ge 0$
$\mu$-almost everywhere.

(d) Prove that if $f \in L^2(\Omega)$ satisfies $0 \le f \le 1$ $\mu$-almost everywhere, then $0 \le
P_{\mathscr{G}} f \le 1$ $\mu$-almost everywhere.

(e) Prove that $|P_{\mathscr{G}} f| \le P_{\mathscr{G}}|f|$ $\mu$-almost everywhere.

(f) Prove that $P_{\mathscr{G}}$ restricts to a contractive projection in $L^\infty(\Omega)$ onto $L^\infty(\Omega,\mathscr{G})$
and extends to a contractive projection in $L^1(\Omega)$ onto $L^1(\Omega,\mathscr{G})$, and that the
properties described in parts (b)–(e) extend to functions in these spaces. Here,
a projection is understood to be a bounded operator $P$ satisfying $P^2 = P$.

(g) Discuss the relation of this problem with Problem 2.17.

(h) Give an explicit expression for $P_{\mathscr{G}}$ in each of the following two cases:

(i) $\Omega = (0,1)$, $\mathscr{F}$ the Borel $\sigma$-algebra, $\mu$ the Lebesgue measure, and $\mathscr{G} =
\{\emptyset,\Omega\}$;
(ii) $\Omega = (-\frac12,\frac12)$, $\mathscr{F}$ the Borel $\sigma$-algebra, $\mu$ the Lebesgue measure, and
$\mathscr{G} = \{B \in \mathscr{F} :\ B = -B\}$.

(i) Use the Radon–Nikodym theorem (Theorem 2.46) to give an alternative
proof of the existence of the projection $P_{\mathscr{G}}$ in $L^1(\Omega)$ of part (f).


3.22 In this problem we outline an alternative proof of the existence part of the
Radon–Nikodym theorem (Theorem 2.46) based on Hilbert space methods. Let $(\Omega,\mathscr{F},\mu)$
be a $\sigma$-finite measure space and let the $\mathbb{K}$-valued measure $\nu$ on $(\Omega,\mathscr{F})$ be
absolutely continuous with respect to $\mu$.

We first assume that $\nu$ is a finite nonnegative measure.

(a) Show that there exists a measurable function $w \in L^1(\Omega,\mu)$ such that $w(\omega) >
0$ for $\mu$-almost all $\omega \in \Omega$.

(b) Show that the mapping $f \mapsto \int_\Omega f\,d\nu$ is bounded on $L^2(\Omega,\lambda)$, where $\lambda$ is the
finite measure on $(\Omega,\mathscr{F})$ given by $\lambda(F) := \nu(F) + \int_F w\,d\mu$. Conclude that
there exists a unique $h \in L^2(\Omega,\lambda)$ such that

$$\int_\Omega f\,d\nu = \int_\Omega f h\,d\lambda, \qquad f \in L^2(\Omega,\lambda).$$

(c) Show that $0 \le h \le 1$ for $\lambda$-almost all $\omega \in \Omega$.
Hint: Apply the identity in part (b) to $f = 1_F$ with $F \in \mathscr{F}$.

(d) Show that $\mu(B) = 0$, where $B = \{\omega \in \Omega :\ h(\omega) = 1\}$.
Hint: Apply the identity in part (b) to $f = 1_B$.

(e) Show that there exists a nonnegative function $g \in L^1(\Omega,\mu)$ such that

$$\nu(F) = \int_F g\,d\mu, \qquad F \in \mathscr{F}.$$

Hint: Apply the identity in part (b) to the function $f = 1 + h + \cdots + h^n$ and use
parts (c), (d), and the monotone convergence theorem to show that the limit
$g := \lim_{n\to\infty}(1 + h + \cdots + h^n)hw$ exists $\mu$-almost everywhere and belongs to
$L^1(\Omega,\mu)$.

This proves the Radon–Nikodym theorem for finite nonnegative measures $\nu$.

(f) Deduce from this the general case.

3.23 Using Zorn's lemma, we will construct two nonequivalent Hilbertian norms on $\ell^2$.

An indexed set $(v_i)_{i\in I}$ (where $I$ is some index set) of a vector space $V$ is called
an algebraic basis if every $v \in V$ admits a unique (up to permutation of the terms)
expansion of the form $v = \sum_{j=1}^{n} c_j v_{i_j}$ with $n \ge 1$ an integer and $c_1,\dots,c_n$ scalars in
$\mathbb{K}$. Thus, every $v$ is expressed as a finite linear combination of the $v_i$. The
uniqueness assumption implies that the $v_i$ are linearly independent. By a straightforward
application of Zorn's lemma (partially order the set of all linearly independent
subsets of $V$ by set inclusion) every vector space has an algebraic basis.

Select an algebraic basis $(h_i)_{i\in I}$ in $\ell^2$ which contains the standard unit vectors.
Now remove one of the vectors that have been added to the standard unit basis
vectors, say $h_{i_0}$, and denote the resulting codimension one subspace by $X$.

(a) Prove that $X$ is dense in $\ell^2$.

Define a linear mapping $\phi : \ell^2 \to \mathbb{K}$ by $\phi = 0$ on $X$ and $\phi(h_{i_0}) := 1$.

(b) Prove that $\phi$ is not continuous.

On $X$ we define a new norm $|||\cdot|||$ as follows. Let $b : I \setminus \{i_0\} \to I$ be a bijection
(which exists since both sets are infinite; prove this). For $i_1,\dots,i_n \in I \setminus \{i_0\}$ and
scalars $c_1,\dots,c_n \in \mathbb{K}$ define

$$\Big|\Big|\Big| \sum_{k=1}^{n} c_k h_{i_k} \Big|\Big|\Big| := \Big\| \sum_{k=1}^{n} c_k h_{b(i_k)} \Big\|.$$

This sets up an isometry $B : X \simeq \ell^2$. We extend this norm to $\ell^2$ by defining

$$|||h + c h_{i_0}|||^2 := |||h|||^2 + |c|^2, \qquad h \in X,\ c \in \mathbb{K}.$$

This defines a norm on $\ell^2$.

(c) Show that $\ell^2$ is a Hilbert space with respect to the norm $|||\cdot|||$ and that $X$ is a
closed subspace.

(d) Prove that $X$ is not closed in $\ell^2$ with respect to the norm $\|\cdot\|$ (thus there is
no analogue of Corollary 1.36 for subspaces of finite codimension). Deduce
that $\|\cdot\|$ and $|||\cdot|||$ are not equivalent.

(e) Does this result contradict the fact that $(\ell^2,\|\cdot\|)$ and $(\ell^2,|||\cdot|||)$ are
isometrically isomorphic (both being separable Hilbert spaces)?


3.24 This problem provides an example of a linear operator on $\ell^2$ which fails to be
bounded. We return to Problem 3.23 and use the notation introduced there. Define
the mapping

$$\pi : \ell^2 \to \ell^2, \qquad \sum_{i\in F} c_i h_i \mapsto \sum_{i\in F\setminus\{i_0\}} c_i h_i \quad \text{for finite sets } F \subseteq I.$$

(a) Prove that $\pi$ is linear, satisfies $\pi^2 = \pi$, and has range $X$.
(b) Prove that $\pi$ fails to be bounded.


3.25 This problem assumes familiarity with the language of probability theory. Let
$(B_t)_{t\in[0,1]}$ be a standard Brownian motion on a probability space $(\Omega,\mathscr{F},\mathbb{P})$. For
each $t \in [0,1]$, let $\mathscr{F}_t$ denote the $\sigma$-algebra generated by the family $(B_s)_{s\in[0,t]}$.

(a) Show that if $0 \le s \le t \le 1$, the increment $B_t - B_s$ is independent of $\mathscr{F}_s$.

Let $H_0^2((0,1)\times\Omega)$ denote the subspace of $L^2((0,1)\times\Omega)$ consisting of all stochastic
processes $\xi = (\xi_t)_{t\in[0,1]}$ of the form

$$\xi_t(\omega) = \sum_{n=0}^{N-1} a_n(\omega)\, 1_{(t_n,t_{n+1})}(t), \qquad t \in [0,1],\ \omega \in \Omega, \tag{3.17}$$

where $0 = t_0 < t_1 < \cdots < t_N = 1$ and each $a_n \in L^2(\Omega)$ is $\mathscr{F}_{t_n}$-measurable. For such
processes we define

$$\int_0^1 \xi_t\,dB_t := \sum_{n=0}^{N-1} a_n (B_{t_{n+1}} - B_{t_n}).$$

(b) Show that for all $\xi \in H_0^2((0,1)\times\Omega)$ we have $\int_0^1 \xi_t\,dB_t \in L^2(\Omega)$ and

$$\Big\| \int_0^1 \xi_t\,dB_t \Big\|_{L^2(\Omega)}^2 = \|\xi\|_{L^2((0,1)\times\Omega)}^2.$$

Let $H^2((0,1)\times\Omega)$ denote the closure of $H_0^2((0,1)\times\Omega)$ in $L^2((0,1)\times\Omega)$.

(c) Deduce that the mapping $\xi \mapsto \int_0^1 \xi_t\,dB_t$ admits a unique extension to an
isometry from $H^2((0,1)\times\Omega)$ into $L^2(\Omega)$, the so-called Itô isometry.

The random variable $\int_0^1 \xi_t\,dB_t$ is called the Itô stochastic integral of $\xi$ with respect
to the Brownian motion $(B_t)_{t\in[0,1]}$.

4 
Duality 


The present chapter is devoted to the study of duality of Banach spaces. We begin by 
characterising the duals of various classical Banach spaces, and then proceed to proving 
the Hahn—Banach theorems. These theorems provide the existence of functionals with 
certain desirable properties. The remainder of the chapter is concerned with application 
of these theorems. 


4.1 Duals of the Classical Banach Spaces 


Recall that the dual of a Banach space $X$ is the Banach space $X^* := \mathscr{L}(X,\mathbb{K})$. For $x \in X$
and $x^* \in X^*$, the scalar $x^*(x) \in \mathbb{K}$ is denoted by $\langle x,x^*\rangle$, that is, we write

$$x^*(x) =: \langle x,x^*\rangle.$$


The Hahn—Banach theorems guarantee an abundance of nontrivial functionals in the 
dual of any Banach space. In many concrete situations, however, it is possible to com- 
pletely describe the dual space. It will be our first task to do this for some classical 
Banach spaces discussed in Chapter 2. 


4.1.a Finite-Dimensional Spaces 


It is instructive to start with duality of finite-dimensional spaces. As we have seen, every
finite-dimensional Banach space is isomorphic to $\mathbb{K}^d$ for some integer $d \ge 1$. The dual
of $\mathbb{K}^d$ is determined as follows.
This manuscript will be published by Cambridge University Press in the series “Cambridge Studies in 
Advanced Mathematics”. This version is free to view and download for personal use only. Not for 


re-distribution, re-sale or use in derivative works. 
© Jan van Neerven 
Every $\xi \in \mathbb{K}^d$ determines an element $\phi_\xi \in (\mathbb{K}^d)^*$ by the prescription

$$\phi_\xi(x) := x\cdot\xi = \sum_{n=1}^{d} x_n \xi_n, \qquad x \in \mathbb{K}^d.$$

Indeed, the Cauchy–Schwarz inequality implies $|\phi_\xi(x)| \le \|x\|\|\xi\|$, from which it
follows that $\phi_\xi$ is bounded and $\|\phi_\xi\| \le \|\xi\|$. Conversely, every $\phi \in (\mathbb{K}^d)^*$ is of this form.
To see this, let $e_1,\dots,e_d$ be the standard unit vectors of $\mathbb{K}^d$ and set $\xi_n := \phi(e_n)$. Then
$\xi := (\xi_1,\dots,\xi_d) \in \mathbb{K}^d$ and, for all $x = (x_1,\dots,x_d) = \sum_{n=1}^{d} x_n e_n$,

$$\phi(x) = \phi\Big( \sum_{n=1}^{d} x_n e_n \Big) = \sum_{n=1}^{d} x_n \phi(e_n) = \sum_{n=1}^{d} x_n \xi_n = x\cdot\xi = \phi_\xi(x).$$

It follows that $\phi = \phi_\xi$. Moreover, $\|\xi\|^2 = \phi_\xi(\overline{\xi}) \le \|\phi_\xi\|\|\xi\|$. Together with the
inequality $\|\phi_\xi\| \le \|\xi\|$ it follows that $\|\phi_\xi\| = \|\xi\|$.
In summary, the correspondence $\phi_\xi \leftrightarrow \xi$ establishes an isometric isomorphism

$$(\mathbb{K}^d)^* \simeq \mathbb{K}^d.$$
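
The norm equality $\|\phi_\xi\| = \|\xi\|$ rests on testing $\phi_\xi$ against the unit vector $\overline{\xi}/\|\xi\|$. The following Python sketch illustrates this in $\mathbb{C}^3$; the vector `xi`, the test vectors and the function names are illustrative choices, not part of the text.

```python
import math

def pairing(x, xi):
    """phi_xi(x) = sum_n x_n * xi_n (bilinear: no complex conjugation)."""
    return sum(a * b for a, b in zip(x, xi))

def euclid_norm(v):
    """Euclidean norm on K^d."""
    return math.sqrt(sum(abs(a) ** 2 for a in v))

xi = [3 + 4j, 0j, 1 - 2j]                        # illustrative vector in C^3
norm_xi = euclid_norm(xi)

# The unit vector conj(xi)/||xi|| attains the norm of phi_xi.
x_star = [a.conjugate() / norm_xi for a in xi]
attained = pairing(x_star, xi)

# A few test vectors illustrating the bound |phi_xi(x)| <= ||x|| ||xi||.
tests = [[1, 0, 0], [1j, 2, -1], [0.5 - 1j, 3j, 2]]
ratios = [abs(pairing(x, xi)) / (euclid_norm(x) * norm_xi) for x in tests]

print(norm_xi, attained, ratios)
```

Without the conjugation in `x_star` the value $\phi_\xi(\xi/\|\xi\|)$ would in general be smaller in modulus than $\|\xi\|$, which is why the complex conjugate appears in the argument.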


4.1.b Sequence Spaces 


The above proof scheme can easily be extended to identify the duals of the
infinite-dimensional sequence spaces $c_0$ and $\ell^p$. We begin by proving that the dual of $c_0$ can be
identified with $\ell^1$. Every $\xi \in \ell^1$ determines an element $\phi_\xi \in (c_0)^*$ by the prescription

\[ \phi_\xi(x) := \sum_{n\ge1} x_n\xi_n, \qquad x \in c_0. \]


Indeed,

\[ |\phi_\xi(x)| \le \Bigl(\sup_{n\ge1}|x_n|\Bigr)\sum_{n\ge1}|\xi_n| = \|x\|_\infty\|\xi\|_1, \]

so $\phi_\xi$ is bounded and $\|\phi_\xi\| \le \|\xi\|_1$. Conversely, every $\phi \in (c_0)^*$ is of this form. To see
this, let $(e_n)_{n\ge1}$ be the sequence of standard unit vectors of $c_0$ and set $\xi_n := \phi(e_n)$. We
claim that $\sum_{n\ge1}|\xi_n| < \infty$. To see this, choose scalars $c_n \in \mathbb{K}$ of modulus one such that
$c_n\xi_n = |\xi_n|$. The sequence $(c_1,\dots,c_N,0,0,\dots)$ belongs to $c_0$ and has norm one, and

\[ \sum_{n=1}^N |\xi_n| = \sum_{n=1}^N c_n\xi_n = \phi\Bigl(\sum_{n=1}^N c_n e_n\Bigr) \le \|\phi\|. \]

Since $N \ge 1$ was arbitrary, this establishes the claim, with bound $\|\xi\|_1 \le \|\phi\|$. It follows
that $\xi = (\xi_1,\xi_2,\dots)$ belongs to $\ell^1$ and for all $x \in c_0$ we have

\[ \phi(x) = \phi\Bigl(\sum_{n\ge1} x_n e_n\Bigr) = \sum_{n\ge1} x_n\phi(e_n) = \sum_{n\ge1} x_n\xi_n = \phi_\xi(x). \]

It follows that $\phi = \phi_\xi$, and the preceding bounds combine to the norm equality $\|\phi_\xi\| = \|\xi\|_1$.
In summary, the correspondence $\phi_\xi \leftrightarrow \xi$ establishes an isometric isomorphism

\[ (c_0)^* \simeq \ell^1. \]


In much the same way one proves that the dual of $\ell^p$, $1 \le p < \infty$, can be represented as
$\ell^q$, where $\frac1p + \frac1q = 1$. More precisely, every element $\xi$ of $\ell^q$ defines a bounded functional
$\phi_\xi \in (\ell^p)^*$ of norm $\|\phi_\xi\| \le \|\xi\|_q$ by the same formula as before, this time using Hölder's
inequality

\[ |\phi_\xi(x)| = \Bigl|\sum_{n\ge1} x_n\xi_n\Bigr| \le \|x\|_p\|\xi\|_q. \]

Conversely, every bounded functional $\phi$ is of this form. To see this let $(e_n)_{n\ge1}$ be the
sequence of standard unit vectors of $\ell^p$ and set $\xi_n := \phi(e_n)$. We claim that $(\xi_n)_{n\ge1}$
belongs to $\ell^q$. The case $p = 1$ and $q = \infty$ is trivial, for $|\xi_n| \le \|\phi\|\|e_n\| = \|\phi\|$, $n \ge 1$,
so $\|\xi\|_\infty \le \|\phi\|$. Therefore we only consider the case $1 < p < \infty$, in which case also
$1 < q < \infty$. To prove that $\sum_{n\ge1}|\xi_n|^q < \infty$ it obviously suffices to show that

\[ \sum_{n=1}^N |\xi_n|^q \le \|\phi\|^q, \qquad N \ge 1. \tag{4.1} \]

Fix $N \ge 1$ and put $x^{(N)} := (c_1|\xi_1|^{q/p},\dots,c_N|\xi_N|^{q/p},0,0,\dots)$, where the scalars $c_n \in \mathbb{K}$
are chosen in such a way that $c_n\xi_n = |\xi_n|$. This sequence belongs to $\ell^p$, with norm

\[ \|x^{(N)}\|_p^p = \sum_{n=1}^N |\xi_n|^q. \]

Since $\frac1p + \frac1q = 1$ implies $\frac qp + 1 = q$,

\[ \sum_{n=1}^N |\xi_n|^q = \Bigl|\sum_{n=1}^N c_n|\xi_n|^{q/p}\xi_n\Bigr| = |\phi(x^{(N)})| \le \|\phi\|\|x^{(N)}\|_p = \Bigl(\sum_{n=1}^N |\xi_n|^q\Bigr)^{1/p}\|\phi\|, \]

and therefore $(\sum_{n=1}^N |\xi_n|^q)^{1/q} \le \|\phi\|$, using once more that $\frac1p + \frac1q = 1$. This proves (4.1).
Since $N \ge 1$ was arbitrary it follows that $\xi = (\xi_1,\xi_2,\dots)$ belongs to $\ell^q$ with norm
$\|\xi\|_q \le \|\phi\|$, and for all $x \in \ell^p$ we have

\[ \phi(x) = \phi\Bigl(\sum_{n\ge1} x_n e_n\Bigr) = \sum_{n\ge1} x_n\phi(e_n) = \sum_{n\ge1} x_n\xi_n = \phi_\xi(x). \]

It follows that $\phi = \phi_\xi$, and the preceding bounds combine to the norm equality $\|\phi_\xi\| = \|\xi\|_q$.
In summary, the correspondence $\phi_\xi \leftrightarrow \xi$ establishes an isometric isomorphism

\[ (\ell^p)^* \simeq \ell^q, \qquad 1 \le p < \infty, \quad \frac1p + \frac1q = 1. \]

At the end of Section 4.2 we show that this result does not extend to $p = \infty$.
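The role of the test sequence $x^{(N)}$ in the proof of (4.1) can be illustrated numerically. The Python sketch below builds $x^{(N)}$ for a finitely supported $\xi$ and compares $|\phi_\xi(x^{(N)})|/\|x^{(N)}\|_p$ with $\|\xi\|_q$. The exponent `p = 3`, the sequence `xi` and the function names are assumptions chosen for illustration only.

```python
def p_norm(x, p):
    """l^p norm of a finitely supported sequence, 1 <= p < infinity."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

def extremal_sequence(xi, p):
    """x^(N) = (c_n |xi_n|^(q/p))_n with c_n xi_n = |xi_n|, for 1 < p < infinity."""
    q = p / (p - 1.0)
    x = []
    for a in xi:
        if a == 0:
            x.append(0.0)
        else:
            c = abs(a) / a               # modulus-one scalar with c * a = |a|
            x.append(c * abs(a) ** (q / p))
    return x

p = 3.0
q = p / (p - 1.0)                        # conjugate exponent, here 1.5
xi = [2.0, -1.0, 0.5j, 0.0, -3.0 + 4.0j]  # hypothetical truncated sequence

x = extremal_sequence(xi, p)
phi_x = sum(a * b for a, b in zip(x, xi))   # phi_xi(x) = sum_n x_n xi_n
ratio = abs(phi_x) / p_norm(x, p)           # compare with ||xi||_q
```

Up to rounding, `ratio` coincides with `p_norm(xi, q)`, which is the equality case of Hölder's inequality exploited in the proof.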
4.1.c Spaces of Continuous Functions


Definition 4.1 (Locally compact spaces). A topological space $X$ is called locally compact
if every point $x \in X$ is contained in an open set with compact closure.


For example, the spaces $\mathbb{K}^d$ are locally compact.

When $X$ is a locally compact topological space, we let $C_0(X)$ denote the space of
continuous functions $f : X \to \mathbb{K}$ vanishing at infinity, that is, for every $\varepsilon > 0$ there exists
a compact set $K \subseteq X$ such that $|f(x)| < \varepsilon$ for all $x \in \complement K$. With respect to the supremum
norm, $C_0(X)$ is a Banach space; the proof is similar to that for $c_0$.

The space $M(X)$ of $\mathbb{K}$-valued Borel measures on $X$ has been introduced in Section
2.4, where it was shown that it is a Banach space with respect to the variation
norm $\|\mu\| = |\mu|(X)$. Every $\mu \in M(X)$ determines a bounded functional $\phi_\mu \in (C_0(X))^*$
given by

\[ \phi_\mu(f) := \int_X f\,d\mu. \]

By Proposition 2.49 it satisfies

\[ |\phi_\mu(f)| \le \int_X |f|\,d|\mu| \le \|f\|_\infty\|\mu\| \]

and therefore

\[ \|\phi_\mu\| \le \|\mu\|. \]


In what follows we assume that $X$ is a locally compact Hausdorff space. A Borel
measure $\mu \in M(X)$ is said to be Radon if its variation $|\mu|$ is Radon (see Definition
E.20), that is, if for every Borel subset $B$ of $X$ and all $\varepsilon > 0$ there is a compact set $K \subseteq X$
and an open set $U \subseteq X$ such that $K \subseteq B \subseteq U$ and $|\mu|(U \setminus K) < \varepsilon$; these properties are
referred to as inner regularity with compact sets and outer regularity. By $M_R(X)$ we
denote the space of all Radon measures on $X$. It follows readily from the definitions
that $M_R(X)$ is a closed subspace of $M(X)$, so it is a Banach space with respect to the
variation norm. The next theorem identifies this space as the dual of $C_0(X)$:


Theorem 4.2 (Riesz representation theorem). Let $X$ be a locally compact Hausdorff
space. For every $\phi \in (C_0(X))^*$ there exists a unique Radon measure $\mu \in M_R(X)$ such
that $\phi = \phi_\mu$, that is,

\[ \langle f,\phi\rangle = \int_X f\,d\mu, \qquad f \in C_0(X). \]

This measure satisfies $\|\mu\| = \|\phi\|$. The correspondence $\phi \leftrightarrow \mu$ establishes an isometric
isomorphism

\[ (C_0(X))^* \simeq M_R(X). \]

The representing measure $\mu$ is nonnegative if and only if $\phi$ is positivity preserving.
In the special case of a locally compact metric space we have $M_R(X) = M(X)$ by
Proposition E.21 and we obtain an isometric isomorphism

\[ (C_0(X))^* \simeq M(X). \]

For the proof of Theorem 4.2 we need the following version of Urysohn's lemma.


Proposition 4.3. Let $X$ be a locally compact Hausdorff space. If $K \subseteq U \subseteq X$ with $K$
compact and $U$ open, then there exists a function $g \in C_c(X)$ with support contained in
$U$ such that $0 \le g \le 1$ pointwise on $X$ and $g = 1$ on $K$.


Proof Cover $K$ with finitely many open sets $U_1,\dots,U_k$, each of which has compact
closure. Then $K$ is contained in the open set $(U_1 \cup \dots \cup U_k) \cap U$ and this set has compact
closure. Using this set instead of $U$, we may now appeal to Urysohn's lemma (Proposition C.10).


Proof of Theorem 4.2 Uniqueness is immediate from the norm identity $\|\mu\| = \|\phi\|$
which will be proved in Step 4 of the proof.


Step 1 – We begin with the case of positivity preserving functionals $\phi$. In this step we
prove the existence of a nonnegative representing measure $\mu \in M(X)$ for such functionals.
The Radon property of $\mu$ is shown in Step 2.

Let $\mathscr{U}$ denote the collection of open subsets of $X$. For $U \in \mathscr{U}$ and $f \in C_c(X)$ we write

\[ f \prec U \]

if $0 \le f \le 1$ and the support of $f$ is a compact set contained in $U$. Define

\[ \mu(U) := \sup\{\langle f,\phi\rangle : f \in C_c(X),\ f \prec U\} \]

with the convention that $\mu(\varnothing) := 0$. Note that

\[ 0 \le \mu(U) \le \mu(X) \le \|\phi\|. \]


Let us show that $\mu$ is countably subadditive on the collection $\mathscr{U}$ of open sets. To this end, suppose that $f \prec \bigcup_{j\ge1} U_j$
with $U_j \in \mathscr{U}$ for all $j \ge 1$. Since the support of $f$ is compact, it is contained in some finite
union $\bigcup_{j=1}^n U_j$. Choose functions $g_1,\dots,g_n \in C_c(X)$ such that $g_j \prec U_j$ for $j = 1,\dots,n$
and $\sum_{j=1}^n g_j = 1$ on $\operatorname{supp}(f)$; to see that such functions exist we argue as follows. For
every $x$ in the compact support of $f$ we use Corollary C.9 to choose an open set $U_x$ with
closure contained in $\bigcup_{j\ge1} U_j$. Intersecting this set with an open set containing $x$ and
with compact closure, we may assume that $U_x$ has compact closure. By compactness,
the support of $f$ is contained in a finite union $\bigcup_{j=1}^k U_{x_j}$. Any partition of unity $(g_j)_{j=1}^n$
relative to these sets (see Theorem C.11) has the desired properties. Using that $fg_j \prec U_j$,

\[ \langle f,\phi\rangle = \Bigl\langle \sum_{j=1}^n fg_j,\phi\Bigr\rangle = \sum_{j=1}^n \langle fg_j,\phi\rangle \le \sum_{j=1}^n \mu(U_j) \le \sum_{j\ge1}\mu(U_j). \]

This being true for all $0 \le f \in C_c(X)$ satisfying $f \prec \bigcup_{j\ge1} U_j$, it follows that

\[ \mu\Bigl(\bigcup_{j\ge1} U_j\Bigr) \le \sum_{j\ge1}\mu(U_j) \]

as claimed.
In what follows we freely use the notation and terminology introduced in Appendix
E. Let $\mu^* : 2^X \to [0,\infty]$ be the outer measure associated with $\mu$ through (E.1), that is,

\[ \mu^*(A) := \inf\Bigl\{\sum_{j\ge1}\mu(U_j) : A \subseteq \bigcup_{j\ge1} U_j, \text{ where } U_j \in \mathscr{U} \text{ for all } j \ge 1\Bigr\} \]

for $A \in 2^X$ (see Lemma E.6). By the definition of an outer measure and the countable
subadditivity of $\mu$ we have, for any set $A \in 2^X$,

\[ \mu^*(A) = \inf\{\mu(U) : A \subseteq U, \text{ where } U \in \mathscr{U}\}. \tag{4.2} \]

Clearly, $\mu^*(A) \ge 0$. We also note that

\[ \mu^*(U) = \mu(U) \quad \text{for all } U \in \mathscr{U}, \]

that is, $\mu^*$ extends $\mu$. This fact is used repeatedly below.
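We claim that $\mathscr{U}$ is contained in the set

\[ \mathscr{M}_{\mu^*} := \{A \in 2^X : \mu^*(Q) = \mu^*(Q \cap A) + \mu^*(Q \cap \complement A) \text{ for all } Q \in 2^X\}. \]

To prove this, let $U \in \mathscr{U}$, that is, let $U$ be an open subset of $X$. By the subadditivity
of outer measures we have $\mu^*(Q) \le \mu^*(Q \cap U) + \mu^*(Q \cap \complement U)$. The reverse inequality
trivially holds if $\mu^*(Q) = \infty$, so it suffices to check the inequality for $Q \in 2^X$ satisfying
$\mu^*(Q) < \infty$. Fix an arbitrary $\varepsilon > 0$. Choose an open set $V$ such that $Q \subseteq V$ and $\mu(V) \le
\mu^*(Q) + \varepsilon$; this is possible by (4.2). Let $f, g \in C_c(X)$ satisfy

\[ f \prec U \cap V, \qquad \mu(U \cap V) \le \langle f,\phi\rangle + \varepsilon, \]

respectively

\[ g \prec V \cap \complement(\operatorname{supp} f), \qquad \mu(V \cap \complement(\operatorname{supp} f)) \le \langle g,\phi\rangle + \varepsilon. \]

Such functions $f$ and $g$ exist by the definition of $\mu$. Then, using the linearity of $\phi$ along
with the facts that $f + g \prec V$ (as $f$ and $g$ have disjoint supports both contained in $V$) and
$Q \cap \complement U \subseteq V \cap \complement(\operatorname{supp} f)$ (which follows from $Q \subseteq V$ and $\operatorname{supp}(f) \subseteq U$),

\begin{align*}
\mu^*(Q \cap U) + \mu^*(Q \cap \complement U) &\le \mu^*(U \cap V) + \mu^*(V \cap \complement(\operatorname{supp} f)) \\
&= \mu(U \cap V) + \mu(V \cap \complement(\operatorname{supp} f)) \\
&\le \langle f,\phi\rangle + \langle g,\phi\rangle + 2\varepsilon = \langle f + g,\phi\rangle + 2\varepsilon \\
&\le \mu(V) + 2\varepsilon \le \mu^*(Q) + 3\varepsilon.
\end{align*}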
Since $\varepsilon > 0$ was arbitrary, this gives the desired result.

By Theorem E.7, $\mathscr{M}_{\mu^*}$ is a $\sigma$-algebra and $\mu^*$ restricts to a measure on $\mathscr{M}_{\mu^*}$. Since
$\mathscr{U}$ is contained in $\mathscr{M}_{\mu^*}$, so is $\sigma(\mathscr{U}) = \mathscr{B}(X)$, the Borel $\sigma$-algebra of $X$. Thus we
find that the restriction of $\mu^*$ to $\mathscr{B}(X)$ is a measure. Since we have already seen that
$\mu^*(U) = \mu(U)$ for all $U \in \mathscr{U}$, by slight abuse of notation we shall denote the measure
on $\mathscr{B}(X)$ thus obtained by $\mu$. The bound $\mu(X) \le \|\phi\|$ shows that $\mu$ is a finite measure.
Nonnegativity of $\mu$ follows from the nonnegativity of $\mu^*$.

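Next we check that $\mu$ represents the functional $\phi$. To this end we first claim that if
$A, B \in \mathscr{B}(X)$ and $g \in C_0(X)$ satisfy $\mathbf{1}_A \le g \le \mathbf{1}_B$, then

\[ \mu(A) \le \langle g,\phi\rangle \le \mu(B). \tag{4.3} \]

Indeed, since $A$ is contained in the set $\{g \ge 1\}$ and this set is contained in the open set
$\{g > 1-\delta\}$, we have, for any $0 < \delta < 1$,

\begin{align*}
\mu(A) \le \mu\{g > 1-\delta\} &= \sup\{\langle f,\phi\rangle : f \in C_c(X),\ f \prec \{g > 1-\delta\}\} \\
&\le \Bigl\langle \frac{g}{1-\delta},\phi\Bigr\rangle = \frac{1}{1-\delta}\langle g,\phi\rangle,
\end{align*}

using the positivity of $\phi$ and the fact that on the set $\{g > 1-\delta\}$ we have $f \le 1 <
g/(1-\delta)$ pointwise. Since $0 < \delta < 1$ was arbitrary, it follows that $\mu(A) \le \langle g,\phi\rangle$. In the
same way, for all $\delta > 0$ we have

\[ \mu(B) \ge \mu\{g > 0\} = \sup\{\langle f,\phi\rangle : f \in C_c(X),\ f \prec \{g > 0\}\} \ge \langle (g-\delta)^+,\phi\rangle, \]

using that $(g-\delta)^+$ belongs to $C_c(X)$ and satisfies $(g-\delta)^+ \prec \{g > 0\}$. Since $(g-\delta)^+ \to
g$ in $C_0(X)$ as $\delta \downarrow 0$, it follows that $\mu(B) \ge \langle g,\phi\rangle$. This proves (4.3).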
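Let $0 \le f \in C_0(X)$, fix $\varepsilon > 0$, and for $\delta > 0$ write $f_\delta(\xi) := \min\{f(\xi),\delta\}$. Then

\[ f = \sum_{k\ge0} (f_{(k+1)\varepsilon} - f_{k\varepsilon}). \]

There is no convergence issue here since functions in $C_0(X)$ are bounded, so at most
finitely many terms in this sum are nonzero. From the inequalities

\[ \varepsilon\mathbf{1}_{\{f > (k+1)\varepsilon\}} \le f_{(k+1)\varepsilon} - f_{k\varepsilon} \le \varepsilon\mathbf{1}_{\{f > k\varepsilon\}}, \]

on the one hand we obtain

\[ \varepsilon\mu\{f > (k+1)\varepsilon\} \le \int_X (f_{(k+1)\varepsilon} - f_{k\varepsilon})\,d\mu \le \varepsilon\mu\{f > k\varepsilon\}, \]

while combining them with (4.3) gives

\[ \varepsilon\mu\{f > (k+1)\varepsilon\} \le \langle f_{(k+1)\varepsilon} - f_{k\varepsilon},\phi\rangle \le \varepsilon\mu\{f > k\varepsilon\}. \]

It follows that

\[ \Bigl|\int_X (f_{(k+1)\varepsilon} - f_{k\varepsilon})\,d\mu - \langle f_{(k+1)\varepsilon} - f_{k\varepsilon},\phi\rangle\Bigr| \le \varepsilon\mu\{k\varepsilon < f \le (k+1)\varepsilon\} \]

and consequently

\[ \Bigl|\int_X f\,d\mu - \langle f,\phi\rangle\Bigr| \le \varepsilon\sum_{k\ge0}\mu\{k\varepsilon < f \le (k+1)\varepsilon\} \le \varepsilon\mu(X). \]

Since $\varepsilon > 0$ was arbitrary, this proves that $\int_X f\,d\mu = \langle f,\phi\rangle$ as desired. By the linearity
of both sides, this identity extends to arbitrary $f \in C_0(X)$.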
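The layer decomposition $f = \sum_{k\ge0}(f_{(k+1)\varepsilon} - f_{k\varepsilon})$ used above is a pointwise statement and can be checked at a single point. The following Python sketch does this for one value of $f$; the numbers `eps = 0.25` and `value = 1.3` and the function names are illustrative assumptions.

```python
def truncate(value, delta):
    """f_delta at a point: min{f, delta}, given the value of f at that point."""
    return min(value, delta)

def layers(value, eps):
    """Nonzero terms f_{(k+1)eps} - f_{k eps}, k >= 0, at a point where f = value >= 0."""
    terms = []
    k = 0
    while k * eps < value:           # the k-th term vanishes as soon as f <= k * eps
        terms.append(truncate(value, (k + 1) * eps) - truncate(value, k * eps))
        k += 1
    return terms

eps = 0.25
value = 1.3                          # hypothetical value f(x) >= 0
terms = layers(value, eps)
```

Each term lies between $0$ and $\varepsilon$, only finitely many are nonzero, and their sum recovers the value of $f$ up to rounding.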

Step 2 – We prove next that $\mu$ is a Radon measure. Outer regularity is clear from the
constructions, and inner regularity with compact sets will be proved in two steps: (i)
First we prove that if $U$ is open in $X$, then for every $\varepsilon > 0$ there is a compact set $K \subseteq U$
such that $\mu(U \setminus K) < \varepsilon$; (ii) We then use this to deduce the analogous result for general
Borel sets $B$ in $X$.

(i): Let $U$ be open in $X$. Pick $f \in C_c(X)$ such that $f \prec U$ and $\int_X f\,d\mu > \mu(U) - \varepsilon$
and let $K$ be its support. Then $K \subseteq U$ and we have $\mu(K) \ge \int_X f\,d\mu > \mu(U) - \varepsilon$ since
$0 \le f \le \mathbf{1}_K$. But then $\mu(U \setminus K) < \varepsilon$.

(ii): Suppose next that $B$ is a Borel set in $X$. By outer regularity there is an open set
$V \subseteq X$ such that $B \subseteq V$ and $\mu(V \setminus B) < \varepsilon$. By what we just proved there is a compact
set $L \subseteq X$ such that $L \subseteq V$ and $\mu(V \setminus L) < \varepsilon$. Using outer regularity once more, choose
an open set $W$ such that $V \setminus B \subseteq W$ and $\mu(W) < \varepsilon$. Let $K := L \setminus W$. Then $K$ is compact,
contained in $B$, and

\[ \mu(K) = \mu(L) - \mu(L \cap W) \ge (\mu(V) - \varepsilon) - \mu(W) \ge (\mu(B) - 2\varepsilon) - \mu(W) \ge \mu(B) - 3\varepsilon. \]

It follows that $\mu(B \setminus K) \le 3\varepsilon$ and the claim is proved.

This completes the proof of the theorem for positivity preserving functionals, except
for the norm identity $\|\mu\| = \|\phi\|$ which will be proved, for general functionals $\phi$, in
Step 4.


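Step 3 – The real part of a functional $\phi \in (C_0(X))^*$ is defined, for $f = u + iv \in C_0(X)$
with $u$, $v$ real-valued, by

\[ \operatorname{Re}\phi(f) := \operatorname{Re}\langle u,\phi\rangle + i\operatorname{Re}\langle v,\phi\rangle. \]

It is clear that $\operatorname{Re}\phi$ is additive, and in combination with the identities

\begin{align*}
\operatorname{Re}\phi((a+bi)f) &= \operatorname{Re}\phi((au - bv) + i(bu + av)) \\
&= \operatorname{Re}\langle au - bv,\phi\rangle + i\operatorname{Re}\langle bu + av,\phi\rangle \\
&= (a+bi)(\operatorname{Re}\langle u,\phi\rangle + i\operatorname{Re}\langle v,\phi\rangle) = (a+bi)\operatorname{Re}\phi(f)
\end{align*}

we see that $\operatorname{Re}\phi$ is linear. Boundedness is clear, and therefore $\operatorname{Re}\phi \in (C_0(X))^*$. The
functional $\operatorname{Im}\phi \in (C_0(X))^*$ is defined similarly. Both $\operatorname{Re}\phi$ and $\operatorname{Im}\phi$ are real, in the sense
that they map real-valued functions to real numbers, and we have $\phi = \operatorname{Re}\phi + i\operatorname{Im}\phi$.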

Suppose now that $\phi \in (C_0(X))^*$ is real. In analogy with the formulas for the positive
and negative parts of a real measure we define, for functions $0 \le f \in C_0(X)$,

\[ \phi^+(f) := \sup\{\langle g,\phi\rangle : g \in C_0(X),\ 0 \le g \le f\}. \tag{4.4} \]

We claim that $\phi^+$ is the restriction of a real-linear functional on $C_0(X;\mathbb{R})$ of norm at
most $\|\phi\|$. This will follow from Theorem 4.5. This theorem implies that the dual of
$C_0(X)$ is a Banach lattice and gives a general formula for the positive part of functionals
in the dual of a Banach lattice of which (4.4) is a special case. For the reader's
convenience, however, here we give a self-contained proof of the claim.

It is clear that $|\phi^+(f)| \le \|f\|\|\phi\|$ and $0 = \phi(0) \le \phi^+(f)$. It is also clear that
$\phi^+(cf) = c\phi^+(f)$ for scalars $c \ge 0$. If $0 \le g_1 \le f_1$ and $0 \le g_2 \le f_2$, then $0 \le g_1 + g_2 \le
f_1 + f_2$, so

\[ \phi^+(f_1 + f_2) \ge \phi(g_1 + g_2) = \phi(g_1) + \phi(g_2). \]

Taking the supremum over all admissible $g_1$ and $g_2$ gives the inequality $\phi^+(f_1 + f_2) \ge
\phi^+(f_1) + \phi^+(f_2)$. To prove the converse inequality let $0 \le g \le f_1 + f_2$ and set $g_1 :=
f_1 \wedge g$ and $g_2 := g - g_1$. Then $0 \le g_1 \le f_1$ and $0 \le g_2 \le f_2$, so

\[ \phi(g) = \phi(g_1) + \phi(g_2) \le \phi^+(f_1) + \phi^+(f_2) \]

and therefore $\phi^+(f_1 + f_2) \le \phi^+(f_1) + \phi^+(f_2)$. This proves the additivity of $\phi^+$ on the
cone of nonnegative functions in $C_0(X)$.

For functions $f \in C_0(X;\mathbb{R})$ we define $\phi^+(f) := \phi^+(f^+) - \phi^+(f^-)$. It is routine to
check that $\phi^+$ is real-linear on $C_0(X;\mathbb{R})$. Moreover,

\[ |\phi^+(f)| \le \max\{\phi^+(f^+),\phi^+(f^-)\} \le \|\phi\|\max\{\|f^+\|,\|f^-\|\} = \|\phi\|\|f\|. \]

This completes the proof of the claim.

The functional $\phi^- := \phi^+ - \phi$ is real-linear and bounded on $C_0(X;\mathbb{R})$ and the definition
of $\phi^+$ implies that $\phi^-$ is positive. This gives the representation $\phi = \phi^+ - \phi^-$ with $\phi^\pm$
bounded, linear, and positive.

Since linear combinations of Radon measures are Radon, these reductions make it
possible to apply Step 2 to obtain a Radon measure $\mu \in M_R(X)$ representing $\phi$.


Step 4 – The only thing left to be shown is the norm equality $\|\mu\| = \|\phi\|$. This will
be accomplished by invoking the Radon–Nikodym theorem (Theorem 2.46), or rather,
the result of Example 2.48 which follows from it. It asserts that there exists a function
$h \in L^1(X,|\mu|)$ such that $|h| = 1$ $|\mu|$-almost everywhere and $\mu(B) = \int_B h\,d|\mu|$ for all
Borel sets $B \subseteq X$. By the usual arguments, this implies

\[ \int_X f\,d\mu = \int_X fh\,d|\mu| \]

for all $f \in C_c(X)$.

The space $C_c(X)$ is dense in $C_0(X)$. Indeed, for given $f \in C_0(X)$ and $\varepsilon > 0$, let $K$ be a
compact set such that $|f| < \varepsilon$ outside $K$, and apply Proposition 4.3 to obtain a function
$g \in C_c(X)$ such that $0 \le g \le 1$ pointwise on $X$ and $g = 1$ on $K$. Then $fg \in C_c(X)$
and $\|f - fg\|_\infty < \varepsilon$. As a variation on Remark 2.31, the Radon property of $|\mu|$ and
Proposition 4.3 imply that $C_c(X)$ is also dense in $L^1(X,|\mu|)$.

Combining these observations, we obtain

\begin{align*}
\|\phi\| &= \sup_{\substack{\|f\|_\infty \le 1 \\ f \in C_c(X)}} |\langle f,\phi\rangle| \\
&= \sup_{\substack{\|f\|_\infty \le 1 \\ f \in C_c(X)}} \Bigl|\int_X f\,d\mu\Bigr| = \sup_{\substack{\|f\|_\infty \le 1 \\ f \in C_c(X)}} \Bigl|\int_X fh\,d|\mu|\Bigr| = \|h\|_{L^1(X,|\mu|)} = |\mu|(X) = \|\mu\|
\end{align*}

and the proof is complete.


The duality between spaces of continuous functions and spaces of Borel measures 
shows how elements of Measure Theory emerge naturally from considerations involving 
only linearity and topology (namely, from the problem of finding the continuous linear 
functionals of a space of continuous functions). 


4.1.d Spaces of Integrable Functions 


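Let $(\Omega,\mathscr{F},\mu)$ be a measure space and let $1 \le p, q \le \infty$ satisfy $\frac1p + \frac1q = 1$. By Hölder's
inequality, every function $g \in L^q(\Omega)$ defines a functional $\phi_g \in (L^p(\Omega))^*$ by setting

\[ \phi_g(f) := \int_\Omega fg\,d\mu, \qquad f \in L^p(\Omega), \]

and we have $\|\phi_g\| \le \|g\|_q$. If $1 \le p < \infty$ and the measure space is $\sigma$-finite, every
functional arises in this way:


Theorem 4.4 (Dual of $L^p(\Omega)$). Let $(\Omega,\mathscr{F},\mu)$ be a $\sigma$-finite measure space and let $1 \le
p < \infty$ and $\frac1p + \frac1q = 1$. For every $\phi \in (L^p(\Omega))^*$ there exists a unique $g \in L^q(\Omega)$ such that
$\phi = \phi_g$, that is,

\[ \langle f,\phi\rangle = \int_\Omega fg\,d\mu, \qquad f \in L^p(\Omega), \]

and it satisfies $\|g\|_q = \|\phi\|$. The correspondence $\phi_g \leftrightarrow g$ establishes an isometric
isomorphism

\[ (L^p(\Omega))^* \simeq L^q(\Omega), \qquad 1 \le p < \infty, \quad \frac1p + \frac1q = 1. \]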
Proof Uniqueness is immediate from the norm identity $\|g\|_q = \|\phi\|$. The existence
proof will be given in two steps.


Step 1 – In this step we prove the theorem for the special case $\mu(\Omega) < \infty$. Let $\phi \in
(L^p(\Omega))^*$ be arbitrary and fixed. Then

\[ \nu(A) := \langle \mathbf{1}_A,\phi\rangle, \qquad A \in \mathscr{F}, \]

defines a $\mathbb{K}$-valued measure. Indeed, if the sets $A_j \in \mathscr{F}$ are disjoint, then by dominated
convergence $\lim_{n\to\infty}\sum_{j=1}^n \mathbf{1}_{A_j} = \mathbf{1}_{\bigcup_{j\ge1}A_j}$ in $L^p(\Omega)$, and therefore

\[ \nu\Bigl(\bigcup_{j\ge1} A_j\Bigr) = \lim_{n\to\infty}\Bigl\langle \sum_{j=1}^n \mathbf{1}_{A_j},\phi\Bigr\rangle = \lim_{n\to\infty}\sum_{j=1}^n \nu(A_j) \]

by the boundedness of $\phi$. Clearly $\nu$ is absolutely continuous with respect to $\mu$. By the
Radon–Nikodym theorem (Theorem 2.46) we have $\nu = g\,d\mu$ for some $g \in L^1(\Omega)$. By
the definition of $\nu$ and linearity, this means that

\[ \langle f,\phi\rangle = \int_\Omega fg\,d\mu \quad \text{for all simple functions } f. \tag{4.5} \]

We wish to prove that $g \in L^q(\Omega)$ with $\|g\|_q \le \|\phi\|$ and that the identity $\langle f,\phi\rangle = \int_\Omega fg\,d\mu$
holds for all $f \in L^p(\Omega)$. For $n = 1,2,\dots$ let $g_n := g\mathbf{1}_{\Omega_n}$ with $\Omega_n = \{|g| \le n\}$. These
functions are bounded and for all simple functions $f$ we have, by (4.5),

\[ \Bigl|\int_\Omega fg_n\,d\mu\Bigr| = \Bigl|\int_\Omega f\mathbf{1}_{\Omega_n}g\,d\mu\Bigr| = |\langle f\mathbf{1}_{\Omega_n},\phi\rangle| \le \|f\mathbf{1}_{\Omega_n}\|_p\|\phi\| \le \|f\|_p\|\phi\|. \]

Since the simple functions are dense in $L^p(\Omega)$, Proposition 2.26 implies that $g_n \in L^q(\Omega)$
and $\|g_n\|_q \le \|\phi\|$. This being true for all $n \ge 1$, Fatou's lemma implies that $g \in L^q(\Omega)$
and $\|g\|_q \le \|\phi\|$.

Now that we know this, the density of the simple functions in $L^p(\Omega)$ and Hölder's
inequality imply that (4.5) extends to arbitrary $f \in L^p(\Omega)$. This completes the proof of
the theorem in the case $\mu(\Omega) < \infty$.


Step 2 – The general $\sigma$-finite case follows by an exhaustion argument as follows.
Choose an increasing sequence $\Omega_1 \subseteq \Omega_2 \subseteq \dots$ of sets of finite measure such that
$\bigcup_{n\ge1}\Omega_n = \Omega$. By restriction to functions supported on $\Omega_n$, every $\phi \in (L^p(\Omega))^*$ restricts
to a functional in $(L^p(\Omega_n))^*$, denoted by $\phi_n$, of norm $\|\phi_n\| \le \|\phi\|$. By the previous step,
$\phi_n$ is represented by a unique function $g_n \in L^q(\Omega_n)$ of norm $\|g_n\|_q \le \|\phi_n\| \le \|\phi\|$. Moreover,
by uniqueness we see that if $m \le n$, then $g_n|_{\Omega_m} = g_m$, since both represent $\phi_m$. We
can thus define a measurable function $g : \Omega \to \mathbb{K}$ by setting $g := g_n$ on $\Omega_n$ for $n \ge 1$.
This function satisfies $\|g\|_q = \sup_n \|g_n\|_q \le \|\phi\|$. Moreover, if $f \in L^p(\Omega)$, then by the
continuity of $\phi$ and the dominated convergence theorem,

\[ \langle f,\phi\rangle = \lim_{n\to\infty}\langle \mathbf{1}_{\Omega_n}f,\phi\rangle = \lim_{n\to\infty}\int_{\Omega_n} fg_n\,d\mu = \lim_{n\to\infty}\int_{\Omega_n} fg\,d\mu = \int_\Omega fg\,d\mu. \]

For $1 < p < \infty$ the $\sigma$-finiteness assumption can be omitted; see Problem 4.3.


4.1.e Hilbert Spaces 


The Riesz representation theorem (Theorem 3.15) states that every bounded functional
on a Hilbert space $H$ is of the form $\psi_h$ for some unique $h \in H$, where

\[ \psi_h(g) := (g|h), \qquad g \in H. \]

Moreover, we have equality of norms $\|h\| = \|\psi_h\|$. The identification $\psi_h \leftrightarrow h$ therefore
provides a bijective and norm-preserving correspondence

\[ H^* \leftrightarrow H. \]

It is important to observe that this correspondence is linear if $\mathbb{K} = \mathbb{R}$, but conjugate-linear
if $\mathbb{K} = \mathbb{C}$. This is a consequence of the conjugate-linearity of inner products with
respect to their second variable. Indeed, from $\psi_{ch}(x) = (x|ch) = \overline{c}(x|h) = \overline{c}\psi_h(x)$ it follows
that

\[ \psi_{ch} = \overline{c}\psi_h. \]

In contrast, the correspondence $\phi_x \leftrightarrow x$ in each of the Sections 4.1.a–4.1.d is linear both
when $\mathbb{K} = \mathbb{R}$ and $\mathbb{K} = \mathbb{C}$.
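The conjugate-linearity of $h \mapsto \psi_h$ is easy to observe in $\mathbb{C}^3$ with its standard inner product. The Python sketch below compares $\psi_{ch}(x)$ with $\overline{c}\,\psi_h(x)$ and with $c\,\psi_h(x)$; the vectors `h`, `x` and the scalar `c` are hypothetical sample data.

```python
def inner(x, h):
    """Inner product (x|h) on C^d, linear in x and conjugate-linear in h."""
    return sum(a * b.conjugate() for a, b in zip(x, h))

def psi(h):
    """The functional psi_h : x -> (x|h)."""
    return lambda x: inner(x, h)

h = [1 + 2j, -1j, 3.0 + 0j]       # hypothetical vector in C^3
x = [2 - 1j, 1 + 1j, -1 + 0j]
c = 2 - 3j

lhs = psi([c * a for a in h])(x)          # psi_{ch}(x)
rhs_conjugate = c.conjugate() * psi(h)(x)  # conj(c) * psi_h(x)
rhs_linear = c * psi(h)(x)                 # c * psi_h(x), differs in general
```

Here `lhs` agrees with `rhs_conjugate` but not with `rhs_linear`, as expected over the complex scalars.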


4.1.f Banach Lattices 
The objective of this section is to prove the following theorem.

Theorem 4.5. With respect to the natural partial order given by

\[ x^* \le y^* \iff \langle x,x^*\rangle \le \langle x,y^*\rangle \text{ for all } 0 \le x \in X \]

the dual $X^*$ of a Banach lattice $X$ is a Banach lattice. Moreover, for all $0 \le x \in X$ we
have

\begin{align*}
\langle x, x^* \wedge y^*\rangle &= \inf\{\langle x-y,x^*\rangle + \langle y,y^*\rangle : 0 \le y \le x\}, \\
\langle x, x^* \vee y^*\rangle &= \sup\{\langle x-y,x^*\rangle + \langle y,y^*\rangle : 0 \le y \le x\}.
\end{align*}

A special case of the second formula has already been encountered in (4.4).


For the proof of the theorem we need the following lemma. If $V$ is a vector lattice,
then for $0 \le v \in V$ we write $[0,v] := \{u \in V : 0 \le u \le v\}$.


Lemma 4.6 (Decomposition property). If $V$ is a vector lattice, then for all $0 \le v, v' \in V$
we have

\[ [0,v] + [0,v'] = [0,v+v']. \]
Proof Let $w \in [0,v+v']$; we must show that there exist $u \in [0,v]$ and $u' \in [0,v']$ such
that $u + u' = w$. We claim that $u := v \wedge w$ and $u' := w - u$ have the required properties.
It is clear that $u \in [0,v]$, $u' \ge 0$, and $u + u' = w$, and by Proposition 2.52(2) we have

\[ v' - u' = (v' - w) + (v \wedge w) = ((v' - w) + v) \wedge ((v' - w) + w) = ((v + v') - w) \wedge v' \ge 0; \]

this proves that $u' \in [0,v']$.
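In the vector lattice $\mathbb{R}^d$ with the coordinatewise order, where $v \wedge w$ is the coordinatewise minimum, the decomposition constructed in the proof can be computed explicitly. The Python sketch below does this; the function name `decompose` and the sample vectors are illustrative assumptions.

```python
def decompose(w, v):
    """Split w as w = u + u' with u := v ^ w and u' := w - u.

    Vectors are tuples in R^d with the coordinatewise order, so the
    infimum v ^ w is the coordinatewise minimum.
    """
    u = tuple(min(a, b) for a, b in zip(v, w))       # u := v ^ w
    u_prime = tuple(b - a for a, b in zip(u, w))     # u' := w - u
    return u, u_prime

# Hypothetical data with 0 <= w <= v + v'.
v = (1.0, 2.0, 0.5)
v_prime = (3.0, 0.0, 1.0)
w = (2.5, 1.5, 1.5)

u, u_prime = decompose(w, v)
```

For these data one finds $u = (1, 1.5, 0.5) \in [0,v]$ and $u' = (1.5, 0, 1) \in [0,v']$.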
Proof of Theorem 4.5 First note that if $0 \le y \le x$, then also $0 \le x - y \le x$ and therefore
$\|y\| \le \|x\|$ and $\|x - y\| \le \|x\|$. It follows that

\[ |\langle x-y,x^*\rangle + \langle y,y^*\rangle| \le \|x\|(\|x^*\| + \|y^*\|), \]

showing that the infima and suprema on the right-hand sides of the formulas in the
statement of the theorem are finite. Lemma 4.6 implies that the right-hand sides are
additive on the positive cone $X^+$ of $X$. To see this, let $x, x' \in X^+$. Then

\begin{align*}
&\inf\{\langle x + x' - y,x^*\rangle + \langle y,y^*\rangle : y \in [0,x+x']\} \\
&\quad = \inf\{\langle x-u,x^*\rangle + \langle x'-u',x^*\rangle + \langle u,y^*\rangle + \langle u',y^*\rangle : u \in [0,x],\ u' \in [0,x']\} \\
&\quad = \inf\{\langle x-u,x^*\rangle + \langle u,y^*\rangle : u \in [0,x]\} + \inf\{\langle x'-u',x^*\rangle + \langle u',y^*\rangle : u' \in [0,x']\}.
\end{align*}

The corresponding identity for the suprema is proved in the same way. Since the right-hand
sides are also homogeneous with respect to scalar multiplication by nonnegative
scalars, they uniquely extend to linear mappings from $X$ to $\mathbb{R}$ and therefore define elements
of $X^*$. Thus we may define functionals $x^* \wedge y^*$ and $x^* \vee y^*$ in $X^*$ by the right-hand
sides of the formulas in the statement of the theorem.

We begin by showing that the functionals $x^* \wedge y^*$ and $x^* \vee y^*$ thus defined are the
greatest lower bound and the least upper bound for the pair $\{x^*,y^*\}$, respectively. We
will present the argument for $x^* \wedge y^*$; the proof for $x^* \vee y^*$ is entirely similar.

It is clear that for all $x \ge 0$ we have $\langle x,x^* \wedge y^*\rangle \le \langle x,x^*\rangle$. This means that $x^* \wedge y^* \le x^*$,
and in the same way we see that $x^* \wedge y^* \le y^*$. This shows that $x^* \wedge y^*$ is a lower bound
for the pair $\{x^*,y^*\}$. To prove that it is the greatest lower bound we must show that if
$z^* \le x^*$ and $z^* \le y^*$ then $z^* \le x^* \wedge y^*$. But this is easy: if $0 \le y \le x$, then

\[ \langle x,z^*\rangle = \langle x-y,z^*\rangle + \langle y,z^*\rangle \le \langle x-y,x^*\rangle + \langle y,y^*\rangle, \]

and therefore for all $x \ge 0$ we obtain

\[ \langle x,z^*\rangle \le \inf\{\langle x-y,x^*\rangle + \langle y,y^*\rangle : 0 \le y \le x\} = \langle x,x^* \wedge y^*\rangle, \]

that is, $z^* \le x^* \wedge y^*$.
This proves that the pair $(X^*,\le)$ is a lattice. It is clear from the definition of the partial
order $\le$ on $X^*$ that if $x^*, y^* \in X^*$ satisfy $x^* \le y^*$, then $cx^* \le cy^*$ for all $0 \le c \in \mathbb{R}$ and
$x^* + z^* \le y^* + z^*$. It follows that $(X^*,\le)$ is a vector lattice and that the identities in the
statement of the theorem are satisfied.

Since $X^*$ is complete, all that remains to be shown is that $|x^*| \le |y^*|$ implies $\|x^*\| \le
\|y^*\|$. The assumption is equivalent to the statement that for all $x \ge 0$ we have

\[ \sup\{\langle x-y,x^*\rangle - \langle y,x^*\rangle : 0 \le y \le x\} \le \sup\{\langle x-y,y^*\rangle - \langle y,y^*\rangle : 0 \le y \le x\}, \]

that is,

\[ \sup\{\langle z,x^*\rangle : -x \le z \le x\} \le \sup\{\langle z,y^*\rangle : -x \le z \le x\}. \]

This, combined with the identity

\[ \|z^*\| = \sup_{\|x\| \le 1}\langle x,z^*\rangle = \sup_{\|x\| \le 1}\sup\{\langle z,z^*\rangle : -x \le z \le x\} \]

(which follows from the fact that $-x \le z \le x$ implies $\|z\| \le \|x\|$), gives $\|x^*\| \le \|y^*\|$ as
desired.


4.2 The Hahn-Banach Extension Theorem 


We now turn to one of the main pillars of Functional Analysis, the Hahn—Banach theo- 
rem. This is a collective name for a number of closely related results, all of which assert 
the existence of certain nontrivial functionals with desirable properties. These results 
come in two flavours: as extension theorems asserting the extendability of functionals 
that are given a priori on a subspace and as separation theorems asserting that certain 
disjoint subsets can be separated by means of functionals. 

The present section is concerned with Hahn—Banach extension theorems. We begin 
with a version for real vector spaces whose proof exploits the order structure of the real 
line. 


Theorem 4.7 (Hahn–Banach extension theorem for real vector spaces). Let $V$ be a
real vector space and let $p : V \to \mathbb{R}$ be sublinear, that is, for all $v, v' \in V$ and $t \ge 0$ we
have

\[ p(v + v') \le p(v) + p(v'), \qquad p(tv) = tp(v). \]

If $W \subseteq V$ is a subspace and $\phi : W \to \mathbb{R}$ is a linear mapping satisfying

\[ \phi(w) \le p(w), \qquad w \in W, \]

then there exists a linear mapping $\Phi : V \to \mathbb{R}$ that extends $\phi$ and satisfies

\[ \Phi(v) \le p(v), \qquad v \in V. \]
Proof We may assume that $W$ is a proper subspace of $V$. Fix $w, w' \in W$ and $v \in V \setminus W$.
From $\phi(w) + \phi(w') = \phi(w + w') \le p(w + w')$ and $p(w + w') \le p(w - v) + p(v + w')$ we
obtain $\phi(w) - p(w - v) \le p(w' + v) - \phi(w')$. With $\alpha := \sup_{w \in W}(\phi(w) - p(w - v))$ we
therefore have

\[ \phi(w) - \alpha \le p(w - v), \qquad \phi(w') + \alpha \le p(w' + v), \qquad w, w' \in W. \]

Let $W_1$ denote the linear span of $W$ and $v$. Then $\phi_1 : W_1 \to \mathbb{R}$, $\phi_1(w + tv) := \phi(w) + t\alpha$,
is linear, extends $\phi$ and satisfies $\phi_1(w_1) \le p(w_1)$ for all $w_1 \in W_1$. To see this, note that
for $w \in W$ and $t > 0$,

\[ \phi_1(w + tv) = \phi(w) + t\alpha = t(\phi(t^{-1}w) + \alpha) \le tp(t^{-1}w + v) = p(w + tv) \]

and

\[ \phi_1(w - tv) = \phi(w) - t\alpha = t(\phi(t^{-1}w) - \alpha) \le tp(t^{-1}w - v) = p(w - tv), \]

while for $t = 0$ we have $\phi_1(w) = \phi(w)$. This proves that $\phi_1(w_1) \le p(w_1)$ for all $w_1 \in W_1$.
The proof can now be finished by an appeal to Zorn's lemma (Lemma A.3), applied to
all linear extensions $\phi'$ of $\phi$ satisfying the inequality $\phi' \le p$.
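The one-step extension at the heart of this proof can be made concrete in $V = \mathbb{R}^2$. In the Python sketch below, all data are assumptions chosen for illustration: $p$ is the $\ell^1$ norm, $W = \{(s,0) : s \in \mathbb{R}\}$, $\phi(s,0) = s$ and $v = (0,1)$. The supremum defining $\alpha$ is only sampled on a finite grid, which for this particular example happens to give the true value.

```python
def p(v):
    """Sublinear functional on R^2: here the l^1 norm (an illustrative choice)."""
    return abs(v[0]) + abs(v[1])

def phi(s):
    """Linear functional on W = {(s, 0)}: phi(s, 0) = s, so phi <= p on W."""
    return s

v = (0.0, 1.0)                               # a vector outside W
grid = [k / 4.0 for k in range(-40, 41)]     # sample points s for w = (s, 0)

# alpha := sup_w (phi(w) - p(w - v)), sampled over the grid.
alpha = max(phi(s) - p((s - v[0], -v[1])) for s in grid)
# Largest admissible value: inf_w' (p(w' + v) - phi(w')), sampled over the grid.
upper = min(p((s + v[0], v[1])) - phi(s) for s in grid)

def phi1(s, t):
    """Extension to W_1 = span(W, v): phi1(w + t v) = phi(w) + t * alpha."""
    return phi(s) + t * alpha

# Check the domination phi1 <= p at sample points w + t v = (s, t) of W_1.
dominated = all(phi1(s, t) <= p((s, t)) + 1e-12 for s in grid for t in grid)
```

In this example $\alpha = -1$, and any value between $-1$ and $1$ would give an admissible extension, which shows that Hahn–Banach extensions need not be unique.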


This result is used to give a second version of the theorem which is also valid over 
the complex scalars. 


Theorem 4.8 (Hahn–Banach extension theorem for vector spaces). Let $V$ be a (real or
complex) vector space and let $p : V \to [0,\infty)$ be a seminorm, that is, for all $v, v' \in V$ and
$t \in \mathbb{K}$ we have

\[ p(v + v') \le p(v) + p(v'), \qquad p(tv) = |t|p(v). \]

If $W$ is a subspace of $V$ and $\phi : W \to \mathbb{K}$ is a linear mapping satisfying

\[ |\phi(w)| \le p(w), \qquad w \in W, \]

then there exists a linear mapping $\Phi : V \to \mathbb{K}$ that extends $\phi$ and satisfies

\[ |\Phi(v)| \le p(v), \qquad v \in V. \]


Proof First we consider the case $\mathbb{K} = \mathbb{R}$. The assumptions imply $\phi(w) \le p(w)$ for all
$w \in W$, and therefore by Theorem 4.7 the mapping $\phi : W \to \mathbb{R}$ admits a linear extension
$\Phi : V \to \mathbb{R}$ satisfying $\Phi(v) \le p(v)$ for all $v \in V$. Also, $-\Phi(v) = \Phi(-v) \le p(-v) = p(v)$,
and therefore $|\Phi(v)| \le p(v)$ for all $v \in V$.

Next we consider the case $\mathbb{K} = \mathbb{C}$. Let us write $\phi = \operatorname{Re}\phi + i\operatorname{Im}\phi$, where $\operatorname{Re}\phi$ and
$\operatorname{Im}\phi$ are the real and imaginary parts of $\phi$; these functions are real-linear. From

\[ \phi(x) = -i\phi(ix) = -i(\operatorname{Re}\phi(ix) + i\operatorname{Im}\phi(ix)) = \operatorname{Im}\phi(ix) - i\operatorname{Re}\phi(ix), \]

with $\operatorname{Im}\phi(ix)$ and $\operatorname{Re}\phi(ix)$ real, we infer that $\operatorname{Im}\phi(x) = -\operatorname{Re}\phi(ix)$. Hence,

\[ \phi(x) = \operatorname{Re}\phi(x) - i\operatorname{Re}\phi(ix). \]

The real-valued function $\psi := \operatorname{Re}\phi$ satisfies the assumptions of the previous theorem
and thus extends to a real-linear mapping $\Psi : V \to \mathbb{R}$ satisfying $\Psi \le p$. We now define

\[ \Phi(v) := \Psi(v) - i\Psi(iv). \]

Then $\Phi$ extends $\phi$, $\Phi$ is real-linear, and since also

\[ \Phi(iv) = \Psi(iv) - i\Psi(-v) = \Psi(iv) + i\Psi(v) = i(\Psi(v) - i\Psi(iv)) = i\Phi(v), \]

$\Phi$ is actually complex-linear. Finally, for $\tau \in \mathbb{C}$ with $|\tau| = 1$ such that $\tau\Phi(v) = |\Phi(v)|$,

\[ |\Phi(v)| = \tau\Phi(v) = \Phi(\tau v) \overset{(*)}{=} \Psi(\tau v) \le p(\tau v) = |\tau|p(v) = p(v). \]

Here $(*)$ follows from the definition of $\Phi$, noting that $\Phi(\tau v) = |\Phi(v)|$ is nonnegative and
therefore $\Phi(\tau v) = \operatorname{Re}\Phi(\tau v)$, while at the same time $\operatorname{Re}\Phi(\tau v) = \Psi(\tau v)$ by the definition
of $\Phi$ and the fact that $\Psi$ is real-valued.


In the setting of normed spaces, from Theorem 4.8 we infer the following result. 


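Theorem 4.9 (Hahn–Banach extension theorem for Banach spaces). Let $X$ be a normed
space and let $Y \subseteq X$ be a subspace. Then every functional $y^* \in Y^*$ has an extension to
a functional $x^* \in X^*$ that satisfies

\[ \|x^*\| = \|y^*\|. \]

Here, of course, $\|x^*\|$ is the norm of $x^*$ as an element of $X^*$ and $\|y^*\|$ is the norm of
$y^*$ as an element of $Y^*$. Such obvious conventions will be in place throughout the text.


Proof Given a functional $y^* \in Y^*$, we apply Theorem 4.8 to $V = X$, $W = Y$, $\phi(y) :=
\langle y,y^*\rangle$ for $y \in Y$, and $p(x) := \|x\|\|y^*\|$ for $x \in X$. The resulting extension $x^*$ satisfies
$|\langle x,x^*\rangle| \le \|x\|\|y^*\|$ for all $x \in X$, so $\|x^*\| \le \|y^*\|$; the reverse inequality holds because
$x^*$ extends $y^*$.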


Remark 4.10. The proof of Theorem 4.7 depends on the Axiom of Choice through the
use of Zorn's lemma. If $X$ is separable and a countable dense sequence $(x_n)_{n\ge1}$ is given,
Theorem 4.9 can be proved without invoking Zorn's lemma as follows. Revisiting the
proof of Theorem 4.7, starting from a functional $y^* \in Y^*$ one defines $Y_n$ to be the span
of $Y$ and $\{x_1,\dots,x_n\}$ and inductively extends $y^*$ to $Y_1$, then to $Y_2$, and so forth. On the
span of the spaces $Y_n$, $n \ge 1$, we thus obtain a well defined functional of norm at most
$\|y^*\|$. Since this subspace is dense, by Proposition 1.18 this functional uniquely extends
to a functional with the same norm on all of $X$.


Recall the identity

\[ \|x^*\| = \sup_{\|x\| \le 1}|\langle x,x^*\rangle|, \]

which is nothing but the definition of the operator norm of $x^*$ as an element of $\mathscr{L}(X,\mathbb{K})$.
As a consequence of the Hahn–Banach theorem we obtain the following dual expression
for the norm of elements $x \in X$:


Corollary 4.11. For all $x \in X$ we have

\[ \|x\| = \sup_{\|x^*\| \le 1}|\langle x,x^*\rangle|. \]

In particular, if $\langle x,x^*\rangle = 0$ for all $x^* \in X^*$, then $x = 0$.

Proof Fix an arbitrary $x \in X$. If $x = 0$ the asserted identity trivially holds, so we may
assume that $x \ne 0$. Let $Y$ be the one-dimensional subspace of $X$ spanned by $x$ and define
$y_0^* \in Y^*$ by $\langle tx,y_0^*\rangle := t\|x\|$. Then $\|y_0^*\| = 1$. Let $x_0^* \in X^*$ be a Hahn–Banach extension
provided by Theorem 4.9, that is, $x_0^*|_Y = y_0^*$ and $\|x_0^*\| = \|y_0^*\| = 1$. Then

\[ \|x\| = \langle x,y_0^*\rangle = \langle x,x_0^*\rangle \le \sup_{\|x^*\| \le 1}|\langle x,x^*\rangle|, \]

while trivially

\[ \sup_{\|x^*\| \le 1}|\langle x,x^*\rangle| \le \sup_{\|x^*\| \le 1}\|x\|\|x^*\| = \|x\|. \]
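For a finite-dimensional illustration of Corollary 4.11, take $X = \mathbb{R}^3$ with the $\ell^1$ norm, whose dual can be identified with $\mathbb{R}^3$ with the $\ell^\infty$ norm as in Section 4.1.b. The Python sketch below computes the supremum over the dual unit ball by testing its extreme points, the sign vectors; this reduction, the vector `x` and the function names are assumptions made for the sake of the example.

```python
from itertools import product

def norm1(x):
    """l^1 norm on R^d."""
    return sum(abs(a) for a in x)

def dual_sup(x):
    """sup of |<x, xi>| over the unit ball of l^infinity, via its extreme points.

    The map xi -> |<x, xi>| is convex, so on the cube [-1, 1]^d its maximum
    is attained at one of the sign vectors.
    """
    return max(abs(sum(a * s for a, s in zip(x, signs)))
               for signs in product((-1.0, 1.0), repeat=len(x)))

x = (3.0, -4.0, 0.5)        # hypothetical vector in (R^3, ||.||_1)
sup_value = dual_sup(x)
```

The supremum is attained at the sign vector $(1,-1,1)$, which plays the role of the norming functional $x_0^*$ from the proof.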


The next application of Theorem 4.9 provides a condition for recognising proper 
closed subspaces. 


Corollary 4.12. If $Y$ is a proper closed subspace of a Banach space $X$, then for every
$x_0 \in X \setminus Y$ there exists an $x^* \in X^*$ such that

\[ \langle x_0,x^*\rangle \ne 0 \quad \text{and} \quad \langle y,x^*\rangle = 0 \text{ for all } y \in Y. \]


Proof Fix an element $x_0 \in X \setminus Y$. Without loss of generality we may assume that
$\|x_0\| = 1$. Let $X_0$ denote the span of $Y$ and $x_0$. On $X_0$ we can uniquely define a linear
scalar-valued mapping $\phi$ by declaring $\phi(y) := 0$ for all $y \in Y$ and $\phi(x_0) := 1$. The
idea is to prove that $\phi : X_0 \to \mathbb{K}$ is bounded. Once this has been shown, the result follows
from the Hahn–Banach extension theorem.

We claim that there is a constant $C > 0$ such that

\[ \|x_0 + y\| \ge C\|x_0\|, \qquad y \in Y. \]

If such a constant does not exist, for every $n \ge 1$ one can find a $y_n \in Y$ so that

\[ \|x_0 + y_n\| < \frac1n\|x_0\|, \qquad n = 1,2,\dots \]

Then $\lim_{n\to\infty}\|x_0 + y_n\| = 0$, and therefore $x_0 \in \overline{Y} = Y$. This contradiction proves the
claim.
By the claim, for any nonzero scalar $a \in \mathbb{K}$ and $y \in Y$,

\[ \|ax_0 + y\| = |a|\|x_0 + a^{-1}y\| \ge C|a|\|x_0\| = C\|ax_0\|. \]

Hence, the case $a = 0$ being trivial,

\[ |\phi(ax_0 + y)| = |a| = |a|\|x_0\| = \|ax_0\| \le \frac1C\|ax_0 + y\|. \]

This proves that $\phi$ is bounded on $X_0$ and $\|\phi\|_{X_0^*} \le 1/C$.


Definition 4.13 (Complemented subspaces). A closed linear subspace $X_0$ of a normed
space $X$ is said to be complemented if there exists a closed linear subspace $X_1$ of $X$ such
that

• $X_0 + X_1 = X$;
• $X_0 \cap X_1 = \{0\}$.

Here, $X_0 + X_1 := \{x_0 + x_1 : x_0 \in X_0,\ x_1 \in X_1\}$. In this situation we have

\[ X = X_0 \oplus X_1 \]

as a direct sum in the sense discussed in Section 1.1.b.

Definition 4.14 (Projections). A projection is an operator $P \in \mathscr{L}(X)$ satisfying $P^2 = P$.


Notice that boundedness of $P$ is taken to be part of the definition. If $P$ is a projection,
then so is $I - P$ and the range $\mathsf{R}(P)$ of $P$ equals the null space $\mathsf{N}(I - P)$. This implies
that $\mathsf{R}(P)$ is closed and we have a direct sum decomposition

\[ X = \mathsf{N}(P) \oplus \mathsf{R}(P). \]

Thus we have shown the following simple result:


Proposition 4.15. If a closed subspace $X_0$ of a normed space $X$ is the range of a projection
in $X$, then $X_0$ is complemented.
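A small matrix example may help to visualise projections and the decomposition $X = \mathsf{N}(P) \oplus \mathsf{R}(P)$. The Python sketch below uses a hypothetical oblique projection in $\mathbb{R}^2$ onto the span of $(1,0)$ along the span of $(1,-1)$; the matrix `P`, the vector `x` and the helper names are illustrative choices.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(a, x):
    """Apply a 2x2 matrix to a vector in R^2."""
    return [a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1]]

# Hypothetical oblique projection onto span{(1, 0)} along span{(1, -1)}.
P = [[1.0, 1.0], [0.0, 0.0]]
I = [[1.0, 0.0], [0.0, 1.0]]
Q = [[I[i][j] - P[i][j] for j in range(2)] for i in range(2)]   # Q = I - P

x = [3.0, 2.0]
r = apply(P, x)     # component of x in R(P)
n = apply(Q, x)     # component of x in N(P) = R(I - P)
```

Both `P` and `Q` are idempotent, `r + n` recovers `x`, and `P` annihilates `n`, in agreement with the decomposition $x = Px + (I-P)x$.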


Conversely, if X = Xo @ Xj is a direct sum decomposition of a Banach space, then 
the natural projections associated with it are bounded; this will be proved in the next 
chapter (see Proposition 5.12). 

We have seen in Corollary 1.36 that finite-dimensional subspaces are always closed. As an application of the Hahn–Banach theorem we prove next the stronger assertion that they are always complemented. For later use we also include an analogue for subspaces of finite codimension, which does not require the use of the Hahn–Banach theorem. A subspace $X_0$ of a Banach space $X$ is said to have finite codimension if there exists a finite-dimensional subspace $Y$ of $X$ such that $X_0 \cap Y = \{0\}$ and $X_0 + Y = X$. In this situation we define the codimension of $X_0$ to be the dimension of $Y$ and denote this number by $\operatorname{codim} X_0$. The following argument shows that this number is well defined. If $Y_0$ and $Y_1$ are subspaces with the said properties, then for every $y_0 \in Y_0$ there are unique $x_0 \in X_0$ and $y_1 \in Y_1$ such that $y_0 = x_0 + y_1$. The mapping $y_0 \mapsto y_1$ from $Y_0$ to $Y_1$ is easily seen to be linear. By the same procedure we obtain a well defined mapping from $Y_1$ to $Y_0$, and it is clear that these mappings are each other's inverses. Hence they are isomorphisms of finite-dimensional vector spaces and therefore $\dim Y_0 = \dim Y_1$.

Subspaces of finite dimension are closed, but subspaces of finite codimension need 
not be closed: in Problem 3.23 a dense subspace of ¢* with codimension one is con- 
structed, and such subspaces cannot be closed. If Xo is a closed subspace of finite co- 
dimension, then 


$$\operatorname{codim} X_0 = \dim X/X_0.$$


Proposition 4.16. Let $Y$ be a subspace of a normed space $X$. Then the following assertions hold:

(1) if $\dim(Y) < \infty$, then $Y$ is closed and complemented;
(2) if $\operatorname{codim}(Y) < \infty$ and $Y$ is closed, then $Y$ is complemented.


Proof (1): Let $Y$ be a finite-dimensional subspace of $X$. By Corollary 1.36, $Y$ is closed. To prove that $Y$ is complemented we show that $Y$ is the range of a projection in $X$.

Let $(y_n)_{n=1}^N$ be a basis for $Y$. Then every $y \in Y$ admits a unique representation $y = \sum_{n=1}^N c_n(y)\,y_n$ with coefficients $c_n(y) \in \mathbb{K}$. The mappings $c_n : y \mapsto c_n(y)$ are linear, and since linear mappings on finite-dimensional normed spaces are bounded, we have $c_n \in Y^*$. By the Hahn–Banach theorem we may extend each $c_n$ to a functional $x_n^* \in X^*$.

Consider the (bounded) linear operator $P$ on $X$ defined by

$$Px := \sum_{n=1}^N \langle x, x_n^*\rangle\, y_n.$$

It is clear that $P$ maps $X$ into $Y$ and from $\langle y_m, x_n^*\rangle = \delta_{mn}$ we see that

$$Py_m = \sum_{n=1}^N \langle y_m, x_n^*\rangle\, y_n = y_m.$$

This shows that $P$ maps $X$ onto $Y$. The preceding identity, applied to the element $Px \in Y$, also shows that $P^2x = P(Px) = Px$, so $P$ is a projection.
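The construction in part (1) can be tried out in a toy finite-dimensional setting. The following is only a sketch under assumptions that are not in the text: $X = \mathbb{R}^3$, the basis vectors `basis` of $Y$, and the functionals `functionals` (represented by coefficient vectors) are all chosen by hand for illustration, and the resulting projection is not orthogonal.

```python
def pair(x, functional):
    """Duality pairing <x, x*> on R^3; x* is represented by a coefficient vector."""
    return sum(a * b for a, b in zip(x, functional))

# Hypothetical basis (y_1, y_2) of a two-dimensional subspace Y of R^3 and
# hand-picked functionals x_1*, x_2* satisfying <y_m, x_n*> = delta_mn.
basis = [(1, 0, 1), (0, 1, 1)]
functionals = [(1, 0, 0), (0, 1, 0)]

def project(x):
    """P x = sum_n <x, x_n*> y_n, as in the proof of part (1)."""
    result = [0, 0, 0]
    for y, f in zip(basis, functionals):
        coefficient = pair(x, f)
        result = [r + coefficient * yi for r, yi in zip(result, y)]
    return tuple(result)

x = (3, -2, 7)
px = project(x)
print(px)                 # an element of Y
print(project(px) == px)  # idempotence: P(Px) = Px
```

A different choice of extensions $x_n^*$ would give a different projection onto the same $Y$, which reflects the non-uniqueness of Hahn–Banach extensions.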


(2): Let $Y$ be a closed subspace of $X$ of finite codimension $d = \dim X/Y$ and let $\xi_1, \dots, \xi_d$ be a basis for $X/Y$. Choose $x_1, \dots, x_d \in X$ such that $qx_n = \xi_n$, where $q : X \to X/Y$ is the quotient mapping. These vectors are linearly independent, so their linear span $Z$ in $X$ has dimension $d$; in particular, $Z$ is a closed subspace of $X$ since it is finite-dimensional. If $x \in Y \cap Z$, say $x = \sum_{n=1}^d c_n x_n$, then $0 = qx = \sum_{n=1}^d c_n \xi_n$ and therefore $c_n = 0$ for all $n = 1, \dots, d$ by linear independence of the $\xi_n$, so $x = 0$. It follows that $Y \cap Z = \{0\}$.

Fix an arbitrary $x \in X$ and define the scalars $c_n \in \mathbb{K}$ by requiring that $qx = \sum_{n=1}^d c_n \xi_n$, and set $z := \sum_{n=1}^d c_n x_n$ and $y := x - z$. Then $qy = 0$ and $z \in Z$, so $y \in Y$ and $x = y + z \in Y + Z$. It follows that $X = Y + Z$. This proves that we have a direct sum decomposition

$$X = Y \oplus Z.$$


We next identify the duals of closed subspaces and quotients. For this purpose we 
need the first part of the following definition. The second is included for reasons of 
symmetry of presentation and will be needed later. 


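Definition 4.17 (Annihilators and pre-annihilators). Let $X$ be a Banach space.

(i) The annihilator of a subset $A \subseteq X$ is the set
$$A^\perp := \{x^* \in X^* : \langle x, x^*\rangle = 0 \ \text{for all } x \in A\}.$$

(ii) The pre-annihilator of a subset $B \subseteq X^*$ is the set
$${}^\perp B := \{x \in X : \langle x, x^*\rangle = 0 \ \text{for all } x^* \in B\}.$$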


Proposition 4.18. Let $X$ be a Banach space and let $Y$ be a closed subspace of $X$. Then $Y^\perp$ is a closed subspace of $X^*$ and we have the following assertions:

(1) the mapping $i : X^*/Y^\perp \to Y^*$ defined by $i(x^* + Y^\perp) := x^*|_Y$ is well defined and induces an isometric isomorphism

$$Y^* \simeq X^*/Y^\perp;$$

(2) the mapping $j : Y^\perp \to (X/Y)^*$ defined by $jx^*(x + Y) := \langle x, x^*\rangle$ is well defined and induces an isometric isomorphism

$$(X/Y)^* \simeq Y^\perp.$$


Proof The easy proof that $Y^\perp$ is a closed subspace of $X^*$ is left as an exercise.

(1): Let $y^* \in Y^*$ be given, and let $x^* \in X^*$ be an extension with the same norm as provided by the Hahn–Banach theorem. If $\phi \in Y^\perp$, then $x^*$ and $x^* + \phi$ both restrict to $y^*$. This means that we obtain a well defined linear surjection from $X^*/Y^\perp$ to $Y^*$. This mapping is also injective, for if $x^* + Y^\perp$ is mapped to the zero element of $Y^*$, then $x^*|_Y = 0$ and therefore $x^* \in Y^\perp$, so $x^* + Y^\perp$ is the zero element of $X^*/Y^\perp$. We must show that the resulting bijection is an isometry. On the one hand,

$$\|x^* + Y^\perp\|_{X^*/Y^\perp} = \inf_{\phi \in Y^\perp} \|x^* + \phi\| \le \|x^*\| = \|y^*\| = \|i(x^* + Y^\perp)\|.$$

On the other hand, for all $\phi \in Y^\perp$ we have

$$\|i(x^* + Y^\perp)\| = \|y^*\| = \sup_{\|y\| \le 1} |\langle y, y^*\rangle| = \sup_{\|y\| \le 1} |\langle y, x^* + \phi\rangle| \le \|x^* + \phi\|$$

and therefore, taking the infimum over all $\phi \in Y^\perp$,

$$\|i(x^* + Y^\perp)\| \le \inf_{\phi \in Y^\perp} \|x^* + \phi\| = \|x^* + Y^\perp\|_{X^*/Y^\perp}.$$
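(2): It is clear that $j$ is well defined. Fix an arbitrary $x^* \in Y^\perp$. Given $\varepsilon > 0$, choose $x_0 \in X$ such that $\|x_0 + Y\|_{X/Y} = 1$ and $\|jx^*\| < |\langle x_0 + Y, jx^*\rangle| + \varepsilon$. Choose $y_0 \in Y$ such that $\|x_0 + y_0\| < 1 + \varepsilon$. Then,

$$|\langle x_0 + Y, jx^*\rangle| = |\langle x_0, x^*\rangle| = |\langle x_0 + y_0, x^*\rangle| \le \|x_0 + y_0\|\,\|x^*\| \le (1 + \varepsilon)\|x^*\|.$$

It follows that $\|jx^*\|_{(X/Y)^*} \le (1 + \varepsilon)\|x^*\| + \varepsilon$. Since $\varepsilon > 0$ was arbitrary, we find that $\|jx^*\|_{(X/Y)^*} \le \|x^*\|$.

In the converse direction we have

$$\|jx^*\| = \sup_{\|x + Y\|_{X/Y} \le 1} |\langle x + Y, jx^*\rangle| = \sup_{\|x + Y\|_{X/Y} \le 1} |\langle x, x^*\rangle| \ge \sup_{\|x\| \le 1} |\langle x, x^*\rangle| = \|x^*\|,$$

where we used that $\|x\| \le 1$ implies $\|x + Y\|_{X/Y} \le 1$.

It follows that $j$ is isometric. To see that it is also surjective, let $\phi \in (X/Y)^*$ be given. The linear mapping $x \mapsto \phi(x + Y)$ is bounded, noting that both $x \mapsto x + Y$ and $\phi$ are bounded. It thus defines an element $x_\phi^* \in X^*$, and this functional annihilates $Y$. From

$$\langle x + Y, jx_\phi^*\rangle = \langle x, x_\phi^*\rangle = \langle x + Y, \phi\rangle$$

it follows that $jx_\phi^* = \phi$.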


The Hahn—Banach theorem, through Corollary 4.11, offers a technique to reduce cer- 
tain vector-valued questions to their scalar-valued counterparts. By way of example we 
demonstrate this technique by reproving some calculus rules for the vector-valued Rie- 
mann integral of Proposition 1.45. A second example is given in Problem 4.2 where the 
Cauchy integral formula is extended to vector-valued holomorphic functions. 


Second proof of Proposition 1.45, parts (2) and (3). (2): If $f(t_0) \neq f(t_1)$ for certain $t_0 \neq t_1$ in $I$, Corollary 4.11 provides us with a functional $x^* \in X^*$ such that $\langle f(t_0), x^*\rangle \neq \langle f(t_1), x^*\rangle$. Consider the scalar-valued function $\langle f, x^*\rangle(t) := \langle f(t), x^*\rangle$ obtained by applying $x^*$ pointwise. This function is continuous on $[0,1]$ and continuously differentiable on $(0,1)$ with $\langle f, x^*\rangle' = \langle f', x^*\rangle = 0$. Therefore $\langle f, x^*\rangle$ is constant by the scalar-valued version of the proposition. This contradicts the choice of $x^*$.

(3): By the scalar-valued version of the proposition,

$$\Big\langle f(1) - f(0) - \int_0^1 f'(t)\,dt,\ x^*\Big\rangle = \langle f(1), x^*\rangle - \langle f(0), x^*\rangle - \int_0^1 \langle f, x^*\rangle'(t)\,dt = 0$$

for all $x^* \in X^*$. Corollary 4.11 implies that $f(1) - f(0) - \int_0^1 f'(t)\,dt = 0$.

Using duality we can give the following version of the Pettis measurability theorem (Theorem 1.47):


Theorem 4.19 (Pettis measurability theorem, second version). A function $f : \Omega \to X$ is strongly measurable if and only if $f$ takes its values in a separable closed subspace of $X$ and is weakly measurable, that is, $\langle f, x^*\rangle : \Omega \to \mathbb{K}$ is measurable for all $x^* \in X^*$.


Proof The 'only if' part follows from the first version of the Pettis measurability theorem and the trivial fact that strong measurability implies weak measurability. For the 'if' part, choose a dense sequence $(x_k)_{k \ge 1}$ in a closed separable subspace $X_0$ of $X$ where $f$ takes its values. By the Hahn–Banach theorem, for every $k \ge 1$ there is a unit vector $x_k^* \in X^*$ such that $|\langle x_k, x_k^*\rangle| = \|x_k\|$. Then for all $k \ge 1$ we have $\sup_{n \ge 1} |\langle x_k, x_n^*\rangle| = \|x_k\|$, and by a simple approximation argument this implies that $\sup_{n \ge 1} |\langle x, x_n^*\rangle| = \|x\|$ for all $x \in X_0$. Then, for all $x_0 \in X_0$,

$$\omega \mapsto \|f(\omega) - x_0\| = \sup_{n \ge 1} |\langle f(\omega) - x_0, x_n^*\rangle|$$

is a measurable function. Now the result follows from Theorem 1.47.


Theorem 4.19 is accompanied by the following uniqueness result. 


Proposition 4.20. Let $(\Omega, \mathscr{F}, \mu)$ be a measure space. If $f : \Omega \to X$ is a strongly measurable function and for all $x^* \in X^*$ we have $\langle f, x^*\rangle = 0$ $\mu$-almost everywhere, then $f = 0$ $\mu$-almost everywhere.

In the same way one proves that if $\langle f, x^*\rangle = 0$ pointwise for all $x^* \in X^*$, then $f = 0$ pointwise.

Proof Let $(x_n^*)_{n \ge 1}$ be a sequence in $X^*$ separating the points of a closed subspace $X_0$ in which $f$ takes its values; such a sequence exists by the argument in the proof of Theorem 4.19. Since $\langle f, x_n^*\rangle = 0$ outside a $\mu$-null set $N_n$, we conclude that $f = 0$ on the complement of the $\mu$-null set $\bigcup_{n \ge 1} N_n$.


Corollary 4.11 has the interesting consequence that every Banach space can be iso- 
metrically identified with a closed subspace of the bi-dual X** := (X*)* in a natural 
way. More specifically, given an element $x \in X$ we define a mapping $Jx : X^* \to \mathbb{K}$ by

$$Jx(x^*) := \langle x, x^*\rangle.$$

It is clear that this mapping is bounded and therefore it defines an element of the bi-dual $X^{**}$. By the corollary, its norm is given by

$$\|Jx\| = \sup_{\|x^*\| \le 1} |\langle x^*, Jx\rangle| = \sup_{\|x^*\| \le 1} |\langle x, x^*\rangle| = \|x\|,$$

and therefore the mapping $J : x \mapsto Jx$ is isometric. It is also linear, and therefore we have proved:

Proposition 4.21 (Isometric embedding into the bi-dual). The operator $J$ is an isometric embedding of $X$ into $X^{**}$.


The image of $X$ under $J$ is closed in $X^{**}$; this is immediate from the fact that $J$ is isometric. It may happen that $J(X)$ is a proper subspace of $X^{**}$. For instance, the bi-dual of $c_0$ is $\ell^\infty$. Examples of Banach spaces for which we have $J(X) = X^{**}$ include all Hilbert spaces and the spaces $\ell^p$ and $L^p(\Omega)$ for $1 < p < \infty$. Spaces with this property are called reflexive and enjoy some pleasant properties, some of which will be discussed in Section 4.7.b.

We conclude this section by filling in a detail that was left open in our treatment of the duality of $\ell^p$, namely, that the duality $(\ell^p)^* = \ell^q$ with $\frac1p + \frac1q = 1$, which has been shown to hold for $1 \le p < \infty$ in Section 4.1.b, does not hold for $p = \infty$.

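Consider the closed subspace $Y$ of $\ell^\infty$ consisting of all convergent sequences, and define $y^* \in Y^*$ as

$$\langle y, y^*\rangle := \lim_{n \to \infty} y_n$$

for $y = (y_n)_{n \ge 1} \in Y$. Let $x^* \in (\ell^\infty)^*$ be any Hahn–Banach extension of $y^*$. We claim that there exists no $z \in \ell^1$ such that $\langle z, x\rangle = \langle x, x^*\rangle$ for all $x \in \ell^\infty$. Indeed, let $z \in \ell^1$ be given.

Given $0 < \varepsilon < 1$ we can choose $N \ge 1$ so large that $\sum_{n > N} |z_n| < \varepsilon$. Consider now the sequence $x^{(N)} := (0, 0, \dots, 0, 1, 1, 1, \dots) \in Y$, with $N$ zeroes at the beginning. Then

$$\langle x^{(N)}, x^*\rangle = \langle x^{(N)}, y^*\rangle = \lim_{n \to \infty} x_n^{(N)} = 1$$

while on the other hand

$$|\langle z, x^{(N)}\rangle| = \Big|\sum_{n \ge 1} z_n x_n^{(N)}\Big| = \Big|\sum_{n > N} z_n\Big| \le \sum_{n > N} |z_n| < \varepsilon < 1.$$

This shows that $\langle z, x^{(N)}\rangle \neq \langle x^{(N)}, x^*\rangle$.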


4.3 Adjoint Operators 


The Hahn–Banach theorem will now be used to show that when $X$ and $Y$ are Banach spaces and $T \in \mathscr{L}(X,Y)$ is a bounded operator, there exists a unique bounded operator $T^* \in \mathscr{L}(Y^*,X^*)$ of norm $\|T^*\| = \|T\|$ such that

$$\langle Tx, y^*\rangle = \langle x, T^*y^*\rangle$$

for all $x \in X$ and $y^* \in Y^*$. When $X$ and $Y$ are Hilbert spaces, the Riesz representation theorem will be used to prove the existence of a unique bounded operator $T^* \in \mathscr{L}(Y,X)$ of norm $\|T^*\| = \|T\|$ such that

$$(Tx|y) = (x|T^*y), \quad x \in X,\ y \in Y.$$

4.3.a The Banach Space Adjoint

Let $X$ and $Y$ be Banach spaces.

Proposition 4.22. For every bounded operator $T \in \mathscr{L}(X,Y)$ there exists a unique bounded operator $T^* \in \mathscr{L}(Y^*,X^*)$ such that

$$\langle Tx, y^*\rangle = \langle x, T^*y^*\rangle, \quad x \in X,\ y^* \in Y^*. \tag{4.6}$$

Furthermore,

$$\|T^*\| = \|T\|.$$


Proof The idea is to take the left-hand side of (4.6) as a definition for the operator defined by the right-hand side. More precisely, for any given $y^* \in Y^*$ we may define a linear mapping $T^*y^* : X \to \mathbb{K}$ by

$$(T^*y^*)x := \langle Tx, y^*\rangle.$$

This mapping is bounded, of norm $\|T^*y^*\| \le \|T\|\,\|y^*\|$, since

$$|\langle Tx, y^*\rangle| \le \|Tx\|\,\|y^*\| \le \|T\|\,\|x\|\,\|y^*\|.$$

Accordingly $T^*y^*$ defines an element of $X^*$. The resulting mapping $T^* : Y^* \to X^*$ which maps $y^* \in Y^*$ to the element $T^*y^* \in X^*$ is linear. By the above estimate $T^*$ is bounded, of norm $\|T^*\| \le \|T\|$. It is clear from the definitions that $\langle Tx, y^*\rangle = \langle x, T^*y^*\rangle$ for all $x \in X$ and $y^* \in Y^*$.

Turning to uniqueness, if $S : Y^* \to X^*$ is an operator satisfying $\langle Tx, y^*\rangle = \langle x, Sy^*\rangle$ for all $x \in X$ and $y^* \in Y^*$, then $\langle x, T^*y^*\rangle = \langle x, Sy^*\rangle$ for all $x \in X$ and $y^* \in Y^*$, so $T^*y^* = Sy^*$ for all $y^* \in Y^*$, and so $T^* = S$.

Finally,

$$\|T^*\| = \sup_{\|y^*\| \le 1} \|T^*y^*\| = \sup_{\|y^*\| \le 1} \sup_{\|x\| \le 1} |\langle x, T^*y^*\rangle| = \sup_{\|x\| \le 1} \sup_{\|y^*\| \le 1} |\langle Tx, y^*\rangle| = \sup_{\|x\| \le 1} \|Tx\| = \|T\|,$$

using Corollary 4.11 in the penultimate step.


It is clear that $I_X^* = I_{X^*}$, where $I_X$ and $I_{X^*}$ are the identity operators on $X$ and $X^*$, respectively. For all $T_1, T_2 \in \mathscr{L}(X,Y)$ and $c_1, c_2 \in \mathbb{K}$ we have

$$(c_1T_1 + c_2T_2)^* = c_1T_1^* + c_2T_2^*$$

and for all $T \in \mathscr{L}(X,Y)$ and $S \in \mathscr{L}(Y,Z)$ we have

$$(S \circ T)^* = T^* \circ S^*.$$

Definition 4.23 (Adjoint operator). The bounded operator $T^*$ is called the adjoint of $T$.

Example 4.24. Let $X = \mathbb{K}^n$, $Y = \mathbb{K}^m$, and let $A \in \mathscr{L}(\mathbb{K}^n, \mathbb{K}^m)$. With respect to the standard unit bases we may represent $A$ as an $(m \times n)$ matrix with coefficients $a_{ij} \in \mathbb{K}$. With respect to the same bases, and using the identification of $(\mathbb{K}^d)^*$ with $\mathbb{K}^d$ from Section 4.1.a, its adjoint $A^*$ may be represented as the $(n \times m)$ matrix with coefficients $a_{ji}$. Stated differently, the matrix associated with $A^*$ is the transpose of the matrix associated with $A$.


Example 4.25. The adjoint of the kernel operator $T_k$ on $L^2(0,1)$ given by

$$T_k f(t) = \int_0^1 k(t,s) f(s)\,ds,$$

where we assume that $k \in L^2((0,1) \times (0,1))$ (see Example 1.30), is the kernel operator $T_{k^*}$ on $L^2(0,1)$ given by

$$T_{k^*} g(t) = \int_0^1 k^*(t,s) g(s)\,ds \quad\text{with}\quad k^*(t,s) = k(s,t).$$


Example 4.26. Let $1 \le p < \infty$ and $\frac1p + \frac1q = 1$. The adjoint of the left (right) shift on $\ell^p$ is the right (left) shift on $\ell^q$. The adjoint of the left (right) translation on $L^p(\mathbb{R})$ is the right (left) translation on $L^q(\mathbb{R})$.
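The shift statement can be checked on finitely supported sequences, where the pairing $\langle x, y\rangle = \sum_n x_n y_n$ is a finite sum. The sketch below is illustrative only: the helper names `left_shift`, `right_shift` and `pair` are made up for this example, and sequences are stored as Python lists that are implicitly padded with zeros.

```python
def left_shift(x):
    """(x_1, x_2, x_3, ...) -> (x_2, x_3, ...)."""
    return x[1:]

def right_shift(x):
    """(x_1, x_2, ...) -> (0, x_1, x_2, ...)."""
    return [0] + x

def pair(x, y):
    """<x, y> = sum_n x_n y_n; zip stops at the shorter list, i.e. zero padding."""
    return sum(a * b for a, b in zip(x, y))

x = [1, 2, 3, 4]   # finitely supported element of l^p
y = [5, 6, 7]      # finitely supported element of l^q

# <Lx, y> = <x, Ry>: the adjoint of the left shift is the right shift.
print(pair(left_shift(x), y), pair(x, right_shift(y)))
# <Rx, y> = <x, Ly>: the adjoint of the right shift is the left shift.
print(pair(right_shift(x), y), pair(x, left_shift(y)))
```

Both identities reduce to the reindexing $\sum_n x_{n+1} y_n = \sum_n x_n y_{n-1}$.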


As a first application we prove that the duals of (isometrically) isomorphic Banach 
spaces are (isometrically) isomorphic in a natural way: 


Proposition 4.27. If i: X — Y is an (isometric) isomorphism of the Banach spaces X 
and Y, then the adjoint operator i* : Y* — X* is an (isometric) isomorphism of their 
duals. 


Proof If $i^*y^* = 0$, then $\langle ix, y^*\rangle = 0$ for all $x \in X$, so $\langle y, y^*\rangle = 0$ for all $y \in Y$ since $i$ is surjective, and therefore $y^* = 0$. It follows that $i^*$ is injective. From $x^* = (i^{-1} \circ i)^*x^* = i^*((i^{-1})^*x^*)$ we see that $i^*$ is surjective as well. It follows that $i^*$ is a bounded bijection and its inverse $(i^*)^{-1} = (i^{-1})^*$ is bounded, so $i^*$ is an isomorphism. If $i$ is isometric, then from

$$\|i^*y^*\| = \sup_{\|x\| = 1} |\langle x, i^*y^*\rangle| = \sup_{\|ix\| = 1} |\langle ix, y^*\rangle| = \sup_{\|y\| = 1} |\langle y, y^*\rangle| = \|y^*\|$$

we see that $i^*$ is isometric.


We conclude with a simple observation about the bi-adjoint operator T** := (T*)*. 


Identifying $X$ with a closed subspace of $X^{**}$ by means of the natural isometric embedding $J : X \to X^{**}$ (see Proposition 4.21), and likewise for $Y$, the restriction of $T^{**}$ to $X$ equals $T$. Indeed, writing $J$ for both natural embeddings, the claim follows from

$$\langle y^*, T^{**}Jx\rangle = \langle T^*y^*, Jx\rangle = \langle x, T^*y^*\rangle = \langle Tx, y^*\rangle = \langle y^*, JTx\rangle,$$

so that $T^{**}Jx = JTx$ as claimed.

4.3.b The Hilbert Space Adjoint


Let $H$ and $K$ be Hilbert spaces. If $T \in \mathscr{L}(H,K)$ is a bounded operator, its adjoint $T^* \in \mathscr{L}(K^*,H^*)$ is a bounded operator acting in the reverse direction between their duals. By the Riesz representation theorem, the duals $H^*$ and $K^*$ can be canonically identified with $H$ and $K$. Under these identifications, the adjoint of an operator $T \in \mathscr{L}(H,K)$ can be re-interpreted as an operator acting from $K$ to $H$. Although the identifications are conjugate-linear, as an operator from $K$ to $H$ the adjoint of $T$ is nevertheless linear. This is the content of the next proposition which, incidentally, admits a straightforward direct proof which does not call upon the Hahn–Banach theorem.


Proposition 4.28. For every bounded operator $T \in \mathscr{L}(H,K)$ there exists a unique bounded operator $T^* \in \mathscr{L}(K,H)$ such that

$$(Tx|y) = (x|T^*y), \quad x \in H,\ y \in K.$$

Furthermore,

$$\|T\| = \|T^*\| = \|T^*T\|^{1/2}.$$

Proof Let $y \in K$ be fixed and define a mapping $\phi = \phi_{y,T} : H \to \mathbb{K}$ by

$$\phi(x) := (Tx|y).$$

From $|\phi(x)| \le \|Tx\|\,\|y\| \le \|T\|\,\|x\|\,\|y\|$ we see that $\phi$ is bounded with norm at most $\|T\|\,\|y\|$. Hence by the Riesz representation theorem there is a unique element $T^*y \in H$ with norm $\|T^*y\| = \|\phi\|$ such that

$$\phi(x) = (x|T^*y).$$

Combining the two identities we obtain $(Tx|y) = (x|T^*y)$.

We must show that the mapping $T^* : y \mapsto T^*y$ is linear and bounded. Additivity is easy and homogeneity with respect to scalar multiplication follows from

$$(x|T^*(cy)) = (Tx|cy) = \overline{c}\,(Tx|y) = \overline{c}\,(x|T^*y) = (x|cT^*y),$$

which implies $T^*(cy) = cT^*y$.

Next we show that $T^*$ is bounded. This follows from what we already know. Indeed, we have

$$\|T^*y\| = \|\phi\| \le \|T\|\,\|y\|,$$

so $T^*$ is bounded of norm $\|T^*\| \le \|T\|$. Writing $T^{**} := (T^*)^*$, from

$$(T^{**}x|y) = \overline{(y|T^{**}x)} = \overline{(T^*y|x)} = (x|T^*y) = (Tx|y)$$

it follows that $T^{**}x = Tx$ for all $x \in H$ and therefore $T^{**} = T$. Hence, by what we just proved applied to $T^*$, $\|T\| = \|T^{**}\| \le \|T^*\|$. We conclude that equality holds: $\|T\| = \|T^*\|$.

Next we prove the identity $\|T^*T\|^{1/2} = \|T\|$. Clearly $\|T^*T\| \le \|T^*\|\,\|T\| = \|T\|^2$, and in the converse direction we have

$$\|T^*T\| = \sup_{\|x\| \le 1} \|T^*Tx\| = \sup_{\|x\| \le 1} \sup_{\|y\| \le 1} |(T^*Tx|y)| = \sup_{\|y\| \le 1} \sup_{\|x\| \le 1} |(T^*Tx|y)| = \sup_{\|y\| \le 1} \sup_{\|x\| \le 1} |(Tx|Ty)| \ge \sup_{\|x\| \le 1} |(Tx|Tx)| = \sup_{\|x\| \le 1} \|Tx\|^2 = \|T\|^2.$$

Finally we prove uniqueness. If $U \in \mathscr{L}(K,H)$ is bounded and $(x|Uy) = (Tx|y)$ for all $x \in H$ and $y \in K$, then $Uy = T^*y$ for all $y \in K$, so $U = T^*$.


By definition we have

$$(Tx|y) = (x|T^*y)$$

for all $x, y \in H$. Symmetrically, we also have

$$(T^*x|y) = (x|Ty).$$

This can be seen by noting that $(T^*x|y) = \overline{(y|T^*x)} = \overline{(Ty|x)} = (x|Ty)$.

It is clear that $I^* = I$ and for all $T, U \in \mathscr{L}(H)$ and $c \in \mathbb{K}$ we have

$$(T + U)^* = T^* + U^*, \quad (cT)^* = \overline{c}\,T^*,$$

and, as we have seen in the proof of Proposition 4.28,

$$T^{**} = T.$$

Definition 4.29 (Hilbert space adjoint). The bounded operator $T^*$ is called the Hilbert space adjoint of $T$.


The relation between the Hilbert space adjoint (which is a bounded operator on $H$ and which, in this paragraph only, we write as $T^\star$) and the adjoint operator $T^*$ (which is a bounded operator on the dual space $H^*$) is given by

$$T^*\psi_h = \psi_{T^\star h}$$

as elements of $H^*$, where $\psi_h$ and $\psi_{T^\star h}$ are the functionals in $H^*$ associated with $h$ and $T^\star h$, respectively. Indeed, this follows from

$$\langle x, T^*\psi_h\rangle = \langle Tx, \psi_h\rangle = (Tx|h) = (x|T^\star h) = \langle x, \psi_{T^\star h}\rangle.$$

Here, the brackets $\langle\cdot,\cdot\rangle$ denote the duality between $H$ and its dual $H^*$.

Example 4.30. Here are some examples of Hilbert space adjoints. They should be compared with Examples 4.24, 4.25, and 4.26, respectively.


(i) As in Example 4.24, let $A \in \mathscr{L}(\mathbb{K}^n, \mathbb{K}^m)$ be represented as an $(m \times n)$ matrix with coefficients $a_{ij} \in \mathbb{K}$. Viewing $\mathbb{K}^n$ and $\mathbb{K}^m$ as finite-dimensional Hilbert spaces, its Hilbert space adjoint $A^*$ may be represented as the $(n \times m)$ matrix with coefficients $\overline{a_{ji}}$. Stated differently, the matrix associated with $A^*$ is the Hermitian transpose of the matrix associated with $A$.

(ii) The Hilbert space adjoint of the kernel operator $T_k$ on $L^2(0,1)$ of Example 4.25 is the kernel operator $T_{k^*}$ on $L^2(0,1)$ given by

$$T_{k^*} g(t) = \int_0^1 k^*(t,s) g(s)\,ds \quad\text{with}\quad k^*(t,s) = \overline{k(s,t)}.$$

(iii) The adjoint of the left (right) shift in $\ell^2(\mathbb{Z})$ is the right (left) shift. Similarly, the adjoint of the left (right) translation in $L^2(\mathbb{R})$ is the right (left) translation.
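Item (i) lends itself to a quick numerical check. The following sketch assumes the inner product $(x|y) = \sum_i x_i \overline{y_i}$, linear in the first argument as in the computations above; the matrix `A` and the vectors `x`, `y` are arbitrary illustrative data, and the helper names are not taken from the text.

```python
def inner(x, y):
    """(x|y) = sum_i x_i * conj(y_i): linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(matrix, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

def hermitian_transpose(matrix):
    """Entry (j, i) of the result is the complex conjugate of entry (i, j)."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[i][j].conjugate() for i in range(rows)] for j in range(cols)]

A = [[1 + 2j, 0, 3j],
     [2, 1 - 1j, -1]]        # a (2 x 3) matrix, i.e. an operator from K^3 to K^2
x = [1j, 2, 1 - 1j]
y = [3, 1 + 1j]

lhs = inner(apply(A, x), y)                        # (Ax|y)
rhs = inner(x, apply(hermitian_transpose(A), y))   # (x|A*y)
print(lhs, rhs)
```

Replacing `hermitian_transpose` by a plain transpose gives the Banach space adjoint of Example 4.24, which satisfies the bilinear identity $\langle Ax, y\rangle = \langle x, A^*y\rangle$ rather than the sesquilinear one.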


For later reference we state a useful decomposition result. Versions for Banach spaces 
are given in Proposition 5.15 and Theorem 5.16. 


Proposition 4.31. If $T \in \mathscr{L}(H,K)$ is a bounded operator, then $H$ and $K$ admit orthogonal decompositions

$$H = N(T) \oplus \overline{R(T^*)}, \quad K = N(T^*) \oplus \overline{R(T)}.$$

In particular,

(1) $T$ is injective if and only if $T^*$ has dense range;
(2) $T$ has dense range if and only if $T^*$ is injective.

Proof If $x \perp R(T^*)$, then $(Tx|y) = (x|T^*y) = 0$ for all $y \in K$ and therefore $Tx = 0$, so $x \in N(T)$. Conversely, if $x \in N(T)$, then $(x|T^*y) = (Tx|y) = 0$ for all $y \in K$ implies that $x \perp R(T^*)$ and hence $x \perp \overline{R(T^*)}$. This proves the orthogonal decomposition for $H$. The decomposition for $K$ follows by applying it to $T^*$ and using that $T^{**} = T$.
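In finite dimensions the decomposition $H = N(T) \oplus \overline{R(T^*)}$ is the familiar statement that the kernel of a matrix is orthogonal to the range of its transpose. A small sketch with made-up data (a real $2 \times 2$ matrix of rank one; the names `T`, `kernel_vector` and `ys` are illustrative assumptions):

```python
def apply(matrix, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

def transpose(matrix):
    """Real Hilbert space adjoint of a matrix."""
    return [list(column) for column in zip(*matrix)]

def dot(x, y):
    """Real inner product."""
    return sum(a * b for a, b in zip(x, y))

T = [[1, 2],
     [3, 6]]                 # rank one, so N(T) is one-dimensional
kernel_vector = [2, -1]      # T applied to this vector gives zero
print(apply(T, kernel_vector))

# Every element T*y of the range of T* is orthogonal to the kernel of T.
ys = [[1, 0], [0, 1], [5, -7]]
overlaps = [dot(kernel_vector, apply(transpose(T), y)) for y in ys]
print(overlaps)
```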


4.4 The Hahn-Banach Separation Theorem 


In what follows, X is a Banach space. Corollary 4.12 can be interpreted as a separation 
theorem, in that it guarantees the existence of a functional separating a closed subspace 
from a given element not contained in it. The following result provides a far-reaching 
generalisation: 


Theorem 4.32 (Hahn–Banach separation theorem). Let $C$ and $D$ be disjoint nonempty convex sets in $X$, with $C$ open. Then there exists an $x^* \in X^*$ such that the sets $\langle C, x^*\rangle$ and $\langle D, x^*\rangle$ are disjoint.

Proof We prove the theorem in three steps.


Step 1 – First we prove the theorem for the real scalar field and $D = \{x_0\}$. Replacing $C$ and $x_0$ by $C - y_0$ and $x_0 - y_0$ for some fixed $y_0 \in C$, we may assume without loss of generality that $0 \in C$.

Define the Minkowski functional of $C$ as the mapping $\lambda_C : X \to [0,\infty)$ given by

$$\lambda_C(x) := \inf\{t > 0 : t^{-1}x \in C\}.$$

Since $C$ is convex, open, and contains $0$, we have $\lambda_C(x) < 1$ if and only if $x \in C$. We claim that $\lambda_C$ enjoys the following two properties:

(i) $\lambda_C(x + y) \le \lambda_C(x) + \lambda_C(y)$ for all $x, y \in X$;
(ii) $\lambda_C(tx) = t\lambda_C(x)$ for all $t \ge 0$.

To prove (i), fix $\varepsilon > 0$ and let $s, t > 0$ be such that $s^{-1}x \in C$ and $t^{-1}y \in C$, with $s < \lambda_C(x) + \varepsilon$ and $t < \lambda_C(y) + \varepsilon$. Then

$$(s + t)^{-1}(x + y) = \frac{s}{s + t}\,s^{-1}x + \frac{t}{s + t}\,t^{-1}y$$

is a convex combination of the elements $s^{-1}x, t^{-1}y \in C$ and therefore belongs to $C$. It follows that

$$\lambda_C(x + y) \le s + t < \lambda_C(x) + \lambda_C(y) + 2\varepsilon.$$

Since $\varepsilon > 0$ was arbitrary, this establishes (i). Assertion (ii) is obvious.
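The Minkowski functional can be approximated directly from its definition using nothing but a membership test for $C$. The sketch below is an illustration under assumptions that are not part of the proof: $X = \mathbb{R}^2$, $C$ is the open ellipse $\{(a,b) : a^2/4 + b^2 < 1\}$, and the infimum is located by bisection, which works because the set of admissible $t$ is an interval of the form $(\lambda_C(x), \infty)$ when $C$ is open, convex and contains $0$.

```python
import math

def in_C(v):
    """Membership test for the open convex set C = {(a, b): a^2/4 + b^2 < 1}."""
    a, b = v
    return a * a / 4 + b * b < 1

def minkowski(v, member=in_C, upper=1e6, iterations=200):
    """Approximate inf{t > 0 : v/t in C} by bisection (assumes the value is below `upper`)."""
    lo, hi = 0.0, upper
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid > 0 and member((v[0] / mid, v[1] / mid)):
            hi = mid   # v/mid lies in C, so the infimum is at most mid
        else:
            lo = mid   # v/mid lies outside C, so the infimum is at least mid
    return hi

x, y = (1.0, 1.0), (2.0, -0.5)
s = (x[0] + y[0], x[1] + y[1])
print(minkowski(s), minkowski(x) + minkowski(y))        # subadditivity (i)
print(minkowski((3.0, 1.5)), 3 * minkowski((1.0, 0.5)))  # homogeneity (ii)
```

For this particular ellipse the functional has the closed form $\sqrt{a^2/4 + b^2}$, which gives an independent check of the bisection.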
We now apply Theorem 4.7 to the linear span $W$ of $x_0$ and the linear mapping $\phi : W \to \mathbb{R}$ given by $\phi(tx_0) := t$ for $t \in \mathbb{R}$. In view of $\lambda_C(x_0) \ge 1$, for all $t \ge 0$ it satisfies

$$\phi(tx_0) = t \le t\lambda_C(x_0) = \lambda_C(tx_0),$$

while for $t < 0$ we trivially have $\phi(tx_0) = t < 0 \le \lambda_C(tx_0)$. Hence we may apply the theorem and obtain a linear mapping $x^* : X \to \mathbb{R}$ extending $\phi$ which satisfies $x^*(x) \le \lambda_C(x)$ for all $x \in X$. For all $x \in C$ it satisfies $x^*(x) \le \lambda_C(x) < 1$ and for all $x \in -C$ it satisfies $-x^*(x) = x^*(-x) \le \lambda_C(-x) = \lambda_{-C}(x) < 1$. It follows that $|x^*(x)| < 1$ for all $x$ in the open set $C \cap -C$ containing $0$. This proves that $x^* \in X^*$. Since $\langle x, x^*\rangle \le \lambda_C(x) < 1$ for all $x \in C$ and $\langle x_0, x^*\rangle = \phi(x_0) = 1$, this functional has the required properties.


Step 2 – In the case of complex scalars and $D = \{x_0\}$, upon restricting scalar multiplication to the reals, Step 1 provides us with a real-linear mapping $x_{\mathbb{R}}^* : X \to \mathbb{R}$ such that $|x_{\mathbb{R}}^*(x)| < 1$ for all $x \in C \cap -C$ and $x_{\mathbb{R}}^*(x_0) \notin x_{\mathbb{R}}^*(C)$. Then, as in the proof of Theorem 4.8, $x^*(x) := x_{\mathbb{R}}^*(x) - i\,x_{\mathbb{R}}^*(ix)$ is complex-linear and bounded, and satisfies $x^*(x_0) \notin x^*(C)$ (by comparing real parts).

Step 3 – Now we prove the general case. Fix arbitrary $y_0 \in C$ and $z_0 \in D$, and put $x_0 := z_0 - y_0$ and $C' := C - D + x_0$. Then $C'$ is open, convex, and contains $0$. Also, $x_0 \notin C'$, for $x_0 \in C'$ would imply $0 \in C - D$, which is impossible since $C \cap D = \varnothing$. By Step 2 we obtain a functional $x^* \in X^*$ such that $\langle x_0, x^*\rangle \notin \langle C - D + x_0, x^*\rangle$, which is the same as saying that $0 \notin \langle C - D, x^*\rangle$, or equivalently, $\langle C, x^*\rangle \cap \langle D, x^*\rangle = \varnothing$.


Corollary 4.33. Suppose $(x_n)_{n \ge 1}$ is a sequence in $X$ and suppose that there exists an $x \in X$ such that

$$\lim_{n \to \infty} \langle x_n, x^*\rangle = \langle x, x^*\rangle, \quad x^* \in X^*.$$

Then there exists a sequence $(y_n)_{n \ge 1}$ in the convex hull of $(x_n)_{n \ge 1}$ such that

$$\lim_{n \to \infty} y_n = x$$

with convergence in norm.

Proof Denote by $D$ the closure of the convex hull of $(x_n)_{n \ge 1}$. Our task is to prove that $x \in D$. Suppose that this is not the case. Then Theorem 4.32 provides us with a functional $x^* \in X^*$ separating $D$ from (a small enough open ball $C$ around) $x$. This functional also separates $(x_n)_{n \ge 1}$ from $x$, in contradiction to the assumptions of the corollary.
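A standard illustration of the corollary, not taken from the text, is the sequence of unit vectors $e_n$ in $\ell^2$: one has $\langle e_n, z\rangle = z_n \to 0$ for every $z \in \ell^2$, yet $\|e_n\| = 1$ for all $n$. The averages $y_N = \frac1N(e_1 + \dots + e_N)$ lie in the convex hull and have norm $N^{-1/2}$, so they converge to $0$ in norm. The sketch below computes these norms in a finite truncation; the helper names are illustrative.

```python
import math

def unit_vector(n, dim):
    """The n-th standard unit vector (0-based) in a dim-dimensional truncation of l^2."""
    return [1.0 if i == n else 0.0 for i in range(dim)]

def average(vectors):
    """Convex combination with equal weights 1/len(vectors)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def norm2(x):
    """Euclidean (l^2) norm."""
    return math.sqrt(sum(a * a for a in x))

for N in (1, 4, 16, 64):
    vectors = [unit_vector(n, N) for n in range(N)]
    # Each e_n has norm 1, but the average has norm 1/sqrt(N).
    print(N, norm2(average(vectors)), 1 / math.sqrt(N))
```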


4.5 The Krein—Milman Theorem 


Extreme points play an important role in many applications of Functional Analysis. For 
instance, in Quantum Mechanics pure states are the extreme points of the convex set of 
all states (see Chapter 15). 


Definition 4.34 (Extreme points). An extreme point of a convex subset $C$ of a vector space is an element $v \in C$ such that if $v = (1 - \lambda)v_0 + \lambda v_1$ with $v_0, v_1 \in C$ and $0 < \lambda < 1$, then $v_0 = v_1 = v$.


Stated differently, extreme points are points of C which cannot be realised in a non- 
trivial way as a convex combination of other points of C. 


Example 4.35. Let $C$ denote the set of all probability measures on a given measurable space $(\Omega, \mathscr{F})$. Viewing $C$ as a closed convex subset of $M(\Omega)$, the Banach space of $\mathbb{K}$-valued measures on $(\Omega, \mathscr{F})$, we claim that a probability measure $\mu$ is an extreme point of $C$ if and only if $\mu$ is atomic, that is, whenever $A = A_0 \cup A_1$ with disjoint $A_0, A_1 \in \mathscr{F}$, then $\min\{\mu(A_0), \mu(A_1)\} = 0$.

To prove the claim, suppose first that $\mu \in C$ and $\mu = (1 - \lambda)\mu_0 + \lambda\mu_1$ with $\mu_0, \mu_1 \in C$ and $0 < \lambda < 1$. If $\mu_0 \neq \mu_1$, there is a set $A \in \mathscr{F}$ such that $\mu_0(A) \neq \mu_1(A)$. Interchanging $\mu_0$ and $\mu_1$ if necessary, we may assume that $0 \le \mu_0(A) < \mu_1(A) \le 1$. Then $\mu_1(A) > 0$ implies $\mu(A) > 0$, and $\mu_0(\complement A) > 0$ implies $\mu(\complement A) > 0$, and therefore $\mu$ is not atomic. This proves that every atomic measure is an extreme point of $C$.

Conversely, if $\mu \in C$ is not atomic, then there exists a set $A \in \mathscr{F}$ such that $\mu(A) > 0$ and $\mu(\complement A) > 0$. Consider the probability measures $\mu_0, \mu_1 \in C$ given by

$$\mu_0(B) := (\mu(A))^{-1}\mu(B \cap A), \quad \mu_1(B) := (\mu(\complement A))^{-1}\mu(B \cap \complement A), \quad B \in \mathscr{F}.$$

With $\lambda := \mu(\complement A)$ we have $0 < \lambda < 1$ and $\mu = (1 - \lambda)\mu_0 + \lambda\mu_1$, so $\mu$ is not extreme.

As a special case, if $K$ is a compact Hausdorff space, the extreme points of the set of all Borel probability measures on $K$ are the Dirac measures supported on $K$. To see this, suppose that $\mu$ is atomic and let $S$ be its support, that is, $S$ is the complement of the union of all open sets of $\mu$-measure zero. If $S$ is not a singleton, then it contains two distinct points, say $x_0$ and $x_1$. Since $K$ is Hausdorff, they are contained in disjoint open sets $U_0$ and $U_1$. By the definition of support, $\mu(U_0) > 0$ and $\mu(U_1) > 0$, and $\mu$ is not atomic. This proves that $S$ is a singleton, say $S = \{x\}$, and therefore $\mu = \delta_x$.


Closed convex sets need not have any extreme points: 
Example 4.36. In $L^1(0,1)$, the closed convex set

$$C = \{f \in L^1(0,1) : f \ge 0,\ \|f\|_1 = 1\}$$

has no extreme points. Indeed, let $f \in C$. The mapping $\phi(\delta) := \|\mathbf{1}_{(0,\delta)} f\|_1$ is continuous from $[0,1]$ to $[0,1]$, and satisfies $\phi(0) = 0$ and $\phi(1) = 1$. Hence there exists $0 < \delta < 1$ such that $\phi(\delta) = \frac12$. Then $f = \frac12 g + \frac12 h$, where $g, h \in C$ are given by $g = 2 \cdot \mathbf{1}_{(0,\delta)} f$ and $h = 2 \cdot \mathbf{1}_{(\delta,1)} f$, so $f$ is not an extreme point of $C$.
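As a concrete instance of this splitting, one may take the (arbitrarily chosen) density $f(t) = 2t$; the computation below is a worked illustration and is not part of the original example.

```latex
% Worked instance with the illustrative choice f(t) = 2t on (0,1).
\[
  \phi(\delta) = \int_0^\delta 2t \, dt = \delta^2 ,
  \qquad
  \phi(\delta) = \tfrac12 \iff \delta = \tfrac{1}{\sqrt 2} .
\]
\[
  g(t) = 4t \, \mathbf{1}_{(0,\,1/\sqrt2)}(t),
  \qquad
  h(t) = 4t \, \mathbf{1}_{(1/\sqrt2,\,1)}(t),
\]
\[
  \|g\|_1 = \int_0^{1/\sqrt2} 4t \, dt = 2\delta^2 = 1,
  \qquad
  \|h\|_1 = \int_{1/\sqrt2}^{1} 4t \, dt = 2(1 - \delta^2) = 1,
  \qquad
  \tfrac12 g + \tfrac12 h = f .
\]
```

Since $g \neq h$, this exhibits $f$ as a nontrivial convex combination of two elements of $C$.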


As an application of the Hahn—Banach theorem we can prove the following result 
about existence of extreme points. Recall that the (closed) convex hull of a subset S of a 
Banach space is the smallest (closed) convex set containing S. 


Theorem 4.37 (Krein—Milman). Every compact convex subset of a Banach space is the 
closed convex hull of its extreme points. 


Proof Let K be a compact convex subset of the Banach space X. We first prove the 
existence of extreme points of K, and then prove that K is the closed convex hull of its 
extreme points. 

A face in $K$ is a nonempty closed convex subset $F$ of $K$ whose elements can only be realised as convex combinations of elements in $F$, that is, whenever $x \in F$ satisfies $x = (1 - \lambda)x_0 + \lambda x_1$ with $x_0, x_1 \in K$ and $0 < \lambda < 1$, then $x_0, x_1 \in F$. We make two useful observations:

(i) $x \in K$ is an extreme point of $K$ if and only if $\{x\}$ is a face of $K$;
(ii) if $F$ is a face of $K$ and $F'$ is a face of $F$, then $F'$ is a face of $K$.

Claim (i) is evident. For (ii), if $x \in F'$ is given as $x = (1 - \lambda)x_0 + \lambda x_1$ with $x_0, x_1 \in K$ and $0 < \lambda < 1$, then $x_0, x_1 \in F$ since $x \in F$ and $F$ is a face of $K$. Then $x_0, x_1 \in F'$ since $F'$ is a face of $F$.


Step 1 – Let $\mathscr{F}$ denote the collection of faces in $K$. This collection is nonempty, for it contains $K$. We partially order $\mathscr{F}$ by declaring $K_0 \le K_1$ if $K_1 \subseteq K_0$. By the finite intersection property (see Appendix C), any totally ordered subset $\mathscr{G}$ of $\mathscr{F}$ has nonempty intersection; this intersection is again a face of $K$ and is therefore an upper bound for $\mathscr{G}$. Hence we can apply Zorn's lemma and obtain that $\mathscr{F}$ has a maximal element, say $F$. We claim that $F$ is a singleton, say $F = \{x\}$. By (i), this means that $x$ is an extreme point of $K$.

To prove the claim, assume the contrary and let $x_0, x_1 \in F$ be two distinct points. By the Hahn–Banach theorem there exists an $x^* \in X^*$ such that $\operatorname{Re}\langle x_0, x^*\rangle \neq \operatorname{Re}\langle x_1, x^*\rangle$. Let

$$F_0 := \Big\{x \in F : \operatorname{Re}\langle x, x^*\rangle = \inf_{y \in F} \operatorname{Re}\langle y, x^*\rangle\Big\}.$$

Then $F_0$ is a proper closed subset of $F$, which is nonempty since the compactness of $F$ implies that the infimum is a minimum. If an element $x \in F_0$ can be represented as $x = (1 - \lambda)x' + \lambda x''$ with $x', x'' \in F$ and $0 < \lambda < 1$, then

$$(1 - \lambda)\operatorname{Re}\langle x', x^*\rangle + \lambda\operatorname{Re}\langle x'', x^*\rangle = \inf_{y \in F} \operatorname{Re}\langle y, x^*\rangle.$$

If $x' \notin F_0$, then $\operatorname{Re}\langle x', x^*\rangle > \inf_{y \in F} \operatorname{Re}\langle y, x^*\rangle$. Since also $\operatorname{Re}\langle x'', x^*\rangle \ge \inf_{y \in F} \operatorname{Re}\langle y, x^*\rangle$, the above identity cannot hold. The same contradiction is reached if $x'' \notin F_0$. We conclude that $x', x'' \in F_0$ and $F_0$ is a face of $F$.

Now (ii) implies that $F_0$ is a face of $K$. Since $F_0 \ge F$ and $F_0 \neq F$, this contradicts the maximality of $F$. This completes the proof of the claim that $F$ is a singleton. It follows that $K$ has an extreme point.


Step 2 – Let $L$ denote the closed convex hull of all extreme points of $K$. We wish to show that $L = K$. Reasoning by contradiction, suppose that $L$ is a proper subset of $K$ and fix an element $x_0 \in K \setminus L$. By the Hahn–Banach separation theorem (Theorem 4.32) there exists an $x^* \in X^*$ such that $\langle x_0, x^*\rangle \notin \langle L, x^*\rangle$. Multiplying $x^*$ with an appropriate scalar if necessary, we may assume that

$$\operatorname{Re}\langle x_0, x^*\rangle < \inf_{y \in L} \operatorname{Re}\langle y, x^*\rangle.$$

As in Step 1 we see that the set

$$F := \Big\{x \in K : \operatorname{Re}\langle x, x^*\rangle = \inf_{y \in K} \operatorname{Re}\langle y, x^*\rangle\Big\}$$

is a nonempty face of $K$. Moreover, Step 1 applied to $F$ shows that $F$ has an extreme point $x_1$. By (i) and (ii), $x_1$ is also an extreme point of $K$. On the other hand from

$$\operatorname{Re}\langle x_1, x^*\rangle = \inf_{y \in K} \operatorname{Re}\langle y, x^*\rangle \le \operatorname{Re}\langle x_0, x^*\rangle < \inf_{y \in L} \operatorname{Re}\langle y, x^*\rangle$$

we infer that $x_1 \notin L$. Since $L$ contains all extreme points of $K$ we have arrived at a contradiction.


4.6 The Weak and Weak* Topologies 


Some of the more advanced applications of the Hahn—Banach theorem can be conve- 
niently formulated in terms of certain topologies generated by bounded functionals. 
The two most important ones are the weak topology of a Banach space and the weak* 
topology of its dual. 


4.6.a Definition and Elementary Properties 


Definition 4.38 (Weak topologies). Let $V$ and $W$ be vector spaces and let $B : V \times W \to \mathbb{K}$ be a bilinear mapping. The weak topology of $V$ generated by $W$ is the smallest topology $\tau$ on $V$ with the property that the linear mapping $v \mapsto B(v,w)$ is continuous for all $w \in W$.

This topology is obtained as the intersection of all topologies in $V$ for which all linear mappings $v \mapsto B(v,w)$, $w \in W$, are continuous. The family of topologies with this property is nonempty, for it always contains the power set topology of $V$.

By necessity, the weak topology $\tau$ must contain every set of the form

$$U_{v_0,w_0,\varepsilon} := \{v \in V : |B(v - v_0, w_0)| < \varepsilon\},$$

noting that this set is the inverse image under the continuous mapping $v \mapsto B(v,w_0)$ of the open ball $B(B(v_0,w_0);\varepsilon)$ in $\mathbb{K}$.

We claim that $\tau$ coincides with the topology $\tau'$ generated by the sets $U_{v_0,w_0,\varepsilon}$, where $v_0$, $w_0$, and $\varepsilon$ range over $V$, $W$, and $(0,\infty)$ respectively. The observation just made implies that $\tau' \subseteq \tau$. In the opposite direction, for every $w \in W$ the inverse image under $B(\cdot,w)$ of every open ball belongs to $\tau'$, so every $B(\cdot,w)$ is continuous with respect to $\tau'$. Since $\tau$ is the smallest topology with this property we have $\tau \subseteq \tau'$. This establishes the claim.

It follows from the claim that a set $U \subseteq V$ belongs to $\tau$ if and only if it can be written as a union of finite intersections of sets of the form $U_{v_0,w_0,\varepsilon}$. Indeed, the collection $\tau''$ of sets that can be written this way is a topology which contains every set $U_{v_0,w_0,\varepsilon}$, and therefore we have $\tau' \subseteq \tau''$. By the preceding observation, this means that $\tau \subseteq \tau''$. In the converse direction, the fact that topologies are closed under taking unions and finite intersections implies that every set in $\tau''$ belongs to $\tau$.


Proposition 4.39. In the above setting, a sequence $(v_n)_{n\ge 1}$ converges to $v$ with respect
to $\tau$ if and only if

\[ \lim_{n\to\infty} B(v_n - v, w) = 0, \qquad w \in W. \]

Proof The ‘only if’ part follows from the fact that if $v_n \to v$ with respect to $\tau$, then for
all $\varepsilon > 0$ and $w \in W$ we have $v_n \in U_{v,w,\varepsilon}$ for all large enough $n$. For the ‘if’ part we note
that if $U \in \tau$ contains $v$, then the observation preceding the statement of the proposition
allows us to find $U_{v^{(1)},w^{(1)},\varepsilon^{(1)}}, \dots, U_{v^{(k)},w^{(k)},\varepsilon^{(k)}}$ such that

\[ v \in \bigcap_{j=1}^{k} U_{v^{(j)},w^{(j)},\varepsilon^{(j)}} \subseteq U. \]

Since we assume that $B(v_n - v, w^{(j)}) \to 0$ for $j = 1,\dots,k$, for large enough $n$ we have
$v_n \in \bigcap_{j=1}^{k} U_{v^{(j)},w^{(j)},\varepsilon^{(j)}}$ and hence $v_n \in U$.


The duality between a Banach space and its dual leads to two special cases of interest:

Definition 4.40 (The weak and weak* topologies). Let $X$ be a Banach space.


(i) The weak topology of $X$ is the topology induced by $X^*$.
(ii) The weak* topology of $X^*$ is the topology induced by $X$.


It is implicit that in (i) we use the bilinear mapping from $X \times X^*$ to $\mathbb{K}$ given by
$(x,x^*) \mapsto \langle x,x^*\rangle$; in (ii) we use the bilinear mapping from $X^* \times X$ to $\mathbb{K}$ given by
$(x^*,x) \mapsto \langle x,x^*\rangle$. For these topologies, Proposition 4.39 takes the following form:


Corollary 4.41. Let X be a Banach space. The following assertions hold: 


(1) a sequence $(x_n)_{n\ge 1}$ in $X$ converges to $x \in X$ with respect to the weak topology of $X$
if and only if $\lim_{n\to\infty} \langle x_n - x, x^*\rangle = 0$ for all $x^* \in X^*$;

(2) a sequence $(x_n^*)_{n\ge 1}$ in $X^*$ converges to $x^* \in X^*$ with respect to the weak* topology
of $X^*$ if and only if $\lim_{n\to\infty} \langle x, x_n^* - x^*\rangle = 0$ for all $x \in X$.
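
To see how convergence with respect to the weak topology can differ from norm convergence, the following small worked example may help. It is only a sketch: it assumes the identification $(\ell^2)^* = \ell^2$ (see Example 4.57), and the notation $e_n$ for the $n$-th unit vector of $\ell^2$ is introduced here purely for illustration.

```latex
% Assumption: (\ell^2)^* is identified with \ell^2.
% e_n = (0,\dots,0,1,0,\dots) is the n-th unit vector (notation for this example only).
\begin{align*}
\langle e_n, x^*\rangle &= x^*_n \longrightarrow 0 \quad (n\to\infty)
  && \text{for every } x^* = (x^*_k)_{k\ge 1} \in \ell^2,
  \text{ because } \sum_{k\ge 1} |x^*_k|^2 < \infty, \\
\|e_n - 0\| &= 1 && \text{for all } n \ge 1 .
\end{align*}
```

Thus $(e_n)_{n\ge 1}$ converges to $0$ with respect to the weak topology of $\ell^2$, although it does not converge to $0$ in norm.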


Convergence with respect to the weak and weak* topologies will be referred to as
weak convergence and weak* convergence, respectively. The following result is immediate
from the Hahn–Banach separation theorem:


Proposition 4.42 (Closed convex sets are weakly closed). Every closed convex set in a 
Banach space is closed in the weak topology. 
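
One way to see why this is immediate is sketched below. The sketch assumes the separation theorem in the strict form stated in the comment; the formulation used earlier in this chapter may be phrased differently, in which case the constant $\delta$ and the sign of $x^*$ need to be adapted.

```latex
% Assumed form of the Hahn--Banach separation theorem:
% if C is closed and convex and x_0 \notin C, there exist x^* \in X^* and \delta > 0 with
%   \operatorname{Re}\langle y, x^*\rangle \ge \operatorname{Re}\langle x_0, x^*\rangle + \delta
% for all y \in C.
\begin{align*}
U &:= \{x \in X : |\langle x - x_0, x^*\rangle| < \delta\}
   && \text{is weakly open and contains } x_0, \\
y \in C \ &\Longrightarrow\ |\langle y - x_0, x^*\rangle|
   \ge \operatorname{Re}\langle y - x_0, x^*\rangle \ge \delta
   && \text{so that } y \notin U .
\end{align*}
```

Hence every point outside $C$ has a weakly open neighbourhood disjoint from $C$, which means that the complement of $C$ is weakly open.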


The next result characterises functionals that are continuous with respect to the weak
and weak* topologies.


Proposition 4.43. Let $X$ be a Banach space. The following assertions hold:


(1) a linear mapping $\phi : X \to \mathbb{K}$ is continuous with respect to the weak topology if and
only if it belongs to $X^*$;

(2) a linear mapping $\phi : X^* \to \mathbb{K}$ is continuous with respect to the weak* topology if
and only if it belongs to $X$, that is, there exists an $x \in X$ such that $\phi(x^*) = \langle x,x^*\rangle$
for all $x^* \in X^*$.

Proof (1): By assumption, $\phi^{-1}(B_{\mathbb{K}})$ contains a weakly open set containing the origin.
Since weakly open sets are open, this set contains a ball $B(0;r)$ with $r > 0$. This means
that $\phi \in X^*$ and $\|\phi\| \le 1/r$.


(2): By assumption, $\phi^{-1}(B_{\mathbb{K}})$ contains a weak* open set $U$ containing the origin.
Since weak* open sets are weakly open, part (1) shows that $\phi \in X^{**}$.
The set $U$ contains a set of the form

\[ U' := \{ x^* \in X^* : |\langle x_j, x^*\rangle| < \varepsilon, \ j = 1,\dots,k \} \]

for suitable $\varepsilon > 0$, $k \ge 1$, and $x_1,\dots,x_k \in X$. Let $X_0$ denote the span of $x_1,\dots,x_k$. This
space is finite-dimensional and therefore closed, and by Proposition 4.16 it is complemented.
The proof of this proposition shows that $X_0$ is the range of a projection $\pi_0$. Let
$\pi_1 := I - \pi_0$ be the complementary projection and denote by $X_1$ its range.

Viewing $\pi_0$ as a bounded operator from $X$ onto $X_0$, its second adjoint $\pi_0^{**}$ is a bounded
operator from $X^{**}$ to $X_0^{**}$. The space $X_0$, being finite-dimensional, can be identified with
its second dual, the identification being given by the natural inclusion mapping $J : X_0 \to X_0^{**}$
which is surjective in this case (the details are spelled out in Example 4.57). Thus
we may identify $\pi_0^{**}\phi =: x$ with an element of $X_0$. We will show that $\phi(x^*) = \langle x,x^*\rangle$ for
all $x^* \in X^*$.

Let $X^* = R(\pi_0^*) \oplus R(\pi_1^*)$ be the direct sum decomposition associated with the adjoint
projections $\pi_0^*$ and $\pi_1^*$. If $x_1^* \in R(\pi_1^*)$, then $\langle x_j, x_1^*\rangle = 0$ for all $j = 1,\dots,k$, so $\langle x, x_1^*\rangle = 0$.
This implies that $c x_1^* \in U' \subseteq \phi^{-1}(B_{\mathbb{K}})$ for all $c \in \mathbb{K}$, that is, $\phi(c x_1^*) \in B_{\mathbb{K}}$ for all $c \in \mathbb{K}$,
and this is only possible if $\phi(x_1^*) = 0$. For all $x^* = x_0^* + x_1^* \in R(\pi_0^*) \oplus R(\pi_1^*) = X^*$ we
thus obtain

\[ \langle x, x^*\rangle = \langle x, x_0^*\rangle = \langle x_0^*, \pi_0^{**}\phi\rangle = \langle \pi_0^* x_0^*, \phi\rangle = \langle x_0^*, \phi\rangle = \langle x^*, \phi\rangle = \phi(x^*). \]


We apply the second part of this proposition to prove a version of the Hahn–Banach
separation theorem for the weak* topology.


Proposition 4.44. If $F$ is a weak* closed convex subset of $X^*$ and $x_0^* \notin F$, then there
exists an element $x_0 \in X$ such that $\langle x_0, x_0^*\rangle \notin \langle x_0, F\rangle$.


Proof Suppose first that $\mathbb{K} = \mathbb{R}$. By definition of the weak* topology there exists a
weak* open set $U$ of the form

\[ U = \{ x^* \in X^* : |\langle x_j, x^*\rangle| < \varepsilon, \ j = 1,\dots,k \} \]

for suitable $x_1,\dots,x_k \in X$ and $\varepsilon > 0$, such that $(x_0^* + U) \cap F = \emptyset$. By the Hahn–Banach
separation theorem there exists an element $x_0^{**} \in X^{**}$ separating $x_0^* + U$ from $F$. This
forces $x_0^{**}$ to be bounded on $U$, for otherwise the convexity and symmetry of $U$ imply
that $\langle U, x_0^{**}\rangle = \mathbb{R}$ and then the set $\langle x_0^* + U, x_0^{**}\rangle$ would contain the set $\langle F, x_0^{**}\rangle$, contradicting
the choice of $x_0^{**}$.

Since $x_0^{**}$ is bounded on $U$, $x_0^{**}$ is continuous with respect to the weak* topology of
$X^*$. Hence by Proposition 4.43 it can be identified with an element $x_0 \in X$. This element
has the desired properties.

This concludes the proof in the case $\mathbb{K} = \mathbb{R}$. If $\mathbb{K} = \mathbb{C}$ we apply the result for real
scalars to the real Banach space $X_{\mathbb{R}}$ obtained by restricting scalar multiplication to real
scalars.


Using the notation introduced in Definition 4.17 we have the following characterisa- 
tion of weak and weak* closures of subspaces. 


Proposition 4.45. Let $X$ be a Banach space. The following assertions hold:


(1) for every subspace $Y$ of $X$ we have ${}^\perp(Y^\perp) = \overline{Y}^{\,\mathrm{weak}} = \overline{Y}$;
(2) for every subspace $Y$ of $X^*$ we have $({}^\perp Y)^\perp = \overline{Y}^{\,\mathrm{weak}^*}$.


Proof (1): The inclusion ${}^\perp(Y^\perp) \supseteq \overline{Y}^{\,\mathrm{weak}}$ follows from the easy observation that
pre-annihilators are weakly closed, and the equality $\overline{Y}^{\,\mathrm{weak}} = \overline{Y}$ is a consequence of
Proposition 4.42. To prove the inclusion ${}^\perp(Y^\perp) \subseteq \overline{Y}$, let $x \notin \overline{Y}$. By Corollary 4.12 we can find
$x_0^* \in Y^\perp$ such that $\langle x, x_0^*\rangle \neq 0$. This means that $x$ is not in the pre-annihilator of $Y^\perp$.

(2): This is proved in the same way, except that now we use Proposition 4.44.


We recall that the canonical embedding $J : X \to X^{**}$ is defined by $\langle x^*, Jx\rangle := \langle x, x^*\rangle$.
As we have seen before, $J$ is an isometry from $X$ into $X^{**}$. Accordingly we may identify
$X$ with a closed subspace of $X^{**}$ and $\overline{B}_X$ with a closed subset of $\overline{B}_{X^{**}}$. As a further
application of Proposition 4.44 we have the following density result.


Theorem 4.46 (Goldstine). Let $X$ be a Banach space. The following assertions hold:


(1) $X$ is weak* dense in $X^{**}$;
(2) $\overline{B}_X$ is weak* dense in $\overline{B}_{X^{**}}$.


Proof It suffices to prove the second assertion; the first follows from it by normalising
elements $x \in X$ to unit length. Arguing by contradiction, suppose that $x_0^{**} \in \overline{B}_{X^{**}}$ is
not contained in the weak* closure $F$ of $\overline{B}_X$. Then by Proposition 4.44 there exists an
element $x_0^* \in X^*$ such that $\langle x_0^*, x_0^{**}\rangle \notin \langle x_0^*, F\rangle$. By multiplying with a scalar we may assume
that $\|x_0^*\| = 1$. Since $\overline{B}_{X^{**}}$ is weak* closed we have $\overline{B}_X \subseteq F \subseteq \overline{B}_{X^{**}}$. It follows from this
that $\langle x_0^*, F\rangle = \overline{B}_{\mathbb{K}} = \{c \in \mathbb{K} : |c| \le 1\}$, and therefore $|\langle x_0^*, x_0^{**}\rangle| > 1$. But this contradicts
the fact that $\|x_0^{**}\| \le 1$ and $\|x_0^*\| = 1$.
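
As a concrete illustration of Goldstine's theorem one may take $X = c_0$. The sketch below assumes the standard identifications $c_0^* = \ell^1$ and $c_0^{**} = \ell^\infty$ (compare Example 4.58), under which $J$ is the inclusion of $c_0$ into $\ell^\infty$; the symbols $\mathbf{1}$ and $x^{(n)}$ are introduced here only for the purpose of the example.

```latex
% Assumptions: c_0^* = \ell^1 and c_0^{**} = \ell^\infty, with J the inclusion c_0 \subseteq \ell^\infty.
% \mathbf{1} = (1,1,1,\dots) \in \ell^\infty and x^{(n)} = (1,\dots,1,0,0,\dots) \in c_0 with n ones.
\begin{align*}
\|x^{(n)}\|_{c_0} &= 1, \qquad \|\mathbf{1}\|_{\ell^\infty} = 1, \qquad \mathbf{1} \notin c_0, \\
\langle y, x^{(n)} \rangle &= \sum_{k=1}^{n} y_k \ \longrightarrow\ \sum_{k\ge 1} y_k
  = \langle y, \mathbf{1}\rangle \quad (n \to \infty)
  && \text{for every } y = (y_k)_{k\ge 1} \in \ell^1 .
\end{align*}
```

So $\mathbf{1}$ belongs to the weak* closure of the closed unit ball of $c_0$ inside $\ell^\infty$, even though it is not an element of $c_0$ itself.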


Closely related (cf. Problem 4.35) is the following result.

Theorem 4.47. For all $x^{**} \in X^{**}$, $x_1^*,\dots,x_N^* \in X^*$, and $\varepsilon > 0$ there exists an $x \in X$ such
that $\|x\| < \|x^{**}\| + \varepsilon$ and

\[ \langle x, x_n^*\rangle = \langle x_n^*, x^{**}\rangle, \qquad n = 1,\dots,N. \]

The proof uses an elementary version of the open mapping theorem (Theorem 5.8):


Lemma 4.48. Let T be a bounded operator from a normed space X onto a finite- 
dimensional normed space Y. Then T maps open sets to open sets. 


Proof Let $(y_n)_{n=1}^{d}$ be a basis for $Y$ and choose a sequence $(x_n)_{n=1}^{d}$ in $X$ such that
$Tx_n = y_n$ for $n = 1,\dots,d$. Let $X_0$ denote the linear span of $(x_n)_{n=1}^{d}$. The restriction
$T_0 := T|_{X_0} : X_0 \to Y$ is bounded and bijective. By Corollary 1.37, its inverse is bounded.
This implies that $T_0$ maps open sets to open sets.

Now let $U$ be open in $X$. For every $u \in U$ let $r_u > 0$ be such that $B_{X_0}(u;r_u) = u + r_u B_{X_0} \subseteq U$.
Then $U = \bigcup_{u\in U} (u + r_u B_{X_0})$ and therefore

\[ T(U) = \bigcup_{u\in U} \bigl( Tu + r_u T(B_{X_0}) \bigr) \]

is open since $T(B_{X_0}) = T_0(B_{X_0})$ is open.


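Proof of Theorem 4.47 We prove the theorem in two steps.


Step 1 – Let $x_1^*,\dots,x_N^* \in X^*$ and $c_1,\dots,c_N \in \mathbb{K}$ be given. In this step we prove that if
there exists a constant $M \ge 0$ such that for all $\lambda_1,\dots,\lambda_N \in \mathbb{K}$ we have

\[ \Bigl| \sum_{n=1}^{N} \lambda_n c_n \Bigr| \le M \Bigl\| \sum_{n=1}^{N} \lambda_n x_n^* \Bigr\|, \tag{4.7} \]

then there exists an $x \in X$ such that $\|x\| < M + \varepsilon$ and

\[ \langle x, x_n^*\rangle = c_n, \qquad n = 1,\dots,N. \]

Consider the mapping $T : x \mapsto (\langle x, x_n^*\rangle)_{n=1}^{N}$ from $X$ into $\mathbb{K}^N$. We must prove that
$T(B(0;M+\varepsilon))$, which by Lemma 4.48 is an open subset of the finite-dimensional
space $R(T)$, contains $(c_n)_{n=1}^{N}$. Suppose, for a contradiction, that this is not true. Then
$T(B(0;M+\varepsilon))$ is an open subset of $\mathbb{K}^N$ not containing $(c_n)_{n=1}^{N}$. The Hahn–Banach
separation theorem provides us with a sequence $(\lambda_n)_{n=1}^{N} \in \mathbb{K}^N$ such that

\[ \sum_{n=1}^{N} \lambda_n c_n \notin \Bigl\{ \Bigl\langle x, \sum_{n=1}^{N} \lambda_n x_n^* \Bigr\rangle : \|x\| < M + \varepsilon \Bigr\}. \]

Multiplying with an appropriate scalar of modulus one, it follows that also

\[ \Bigl| \sum_{n=1}^{N} \lambda_n c_n \Bigr| \notin \Bigl\{ \Bigl\langle x, \sum_{n=1}^{N} \lambda_n x_n^* \Bigr\rangle : \|x\| < M + \varepsilon \Bigr\}. \]

By scaling, the right-hand side set contains the interval $\bigl[0, M \bigl\| \sum_{n=1}^{N} \lambda_n x_n^* \bigr\|\bigr]$. This
contradicts the assumption (4.7).


Step 2 – Returning to the assumptions of the theorem, fix $x^{**} \in X^{**}$ and $x_1^*,\dots,x_N^* \in X^*$
and set $c_n := \langle x_n^*, x^{**}\rangle$ for $n = 1,\dots,N$. For all $\lambda_1,\dots,\lambda_N \in \mathbb{K}$ we have

\[ \Bigl| \sum_{n=1}^{N} \lambda_n c_n \Bigr| = \Bigl| \Bigl\langle \sum_{n=1}^{N} \lambda_n x_n^*, x^{**} \Bigr\rangle \Bigr| \le \Bigl\| \sum_{n=1}^{N} \lambda_n x_n^* \Bigr\| \, \|x^{**}\|, \]

so the assumptions of Step 1 are satisfied with $M = \|x^{**}\|$.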

4.7 The Banach-Alaoglu Theorem 


We have seen in Chapter 1 that the closed unit ball of a Banach space is compact if and
only if the space is finite-dimensional. In this section we prove that the closed unit ball
of every dual Banach space is compact with respect to the weak* topology, and use this
to prove that a Banach space is reflexive if and only if its closed unit ball is compact
with respect to the weak topology.


4.7.a The Theorem 


In preparation for the main result of this section, Theorem 4.51, we have the following 
simple result. It extends Proposition 3.16 to separable Banach spaces. 


Proposition 4.49. Let $X$ be a separable Banach space and let $(x_n^*)_{n\ge 1}$ be a bounded
sequence in the dual space $X^*$. Then there exists a subsequence $(x_{n_k}^*)_{k\ge 1}$ and an $x^* \in X^*$
such that

\[ \lim_{k\to\infty} \langle x, x_{n_k}^*\rangle = \langle x, x^*\rangle, \qquad x \in X. \]

Proof Let $(x_j)_{j\ge 1}$ be a countable set with dense linear span $X_0$ in $X$. By a diagonal
argument, there exists a subsequence $(x_{n_k}^*)_{k\ge 1}$ of $(x_n^*)_{n\ge 1}$ such that the limit
$\phi(x_j) := \lim_{k\to\infty} \langle x_j, x_{n_k}^*\rangle$ exists for all $j \ge 1$. Then, by linearity, the limit
$\phi(x) := \lim_{k\to\infty} \langle x, x_{n_k}^*\rangle$ exists for all $x \in X_0$. Clearly $x \mapsto \phi(x)$ is linear, and from
$|\phi(x)| \le \|x\| \sup_{k\ge 1} \|x_{n_k}^*\|$ we see that $\phi$ is bounded as a mapping from $X_0$ to $\mathbb{K}$.
Since $X_0$ is dense in $X$, Proposition 1.18 implies that $\phi$ has a unique bounded extension of
the same norm to all of $X$. Denoting this extension also by $\phi$, it follows from Proposition 1.19
that $\lim_{k\to\infty} \langle x, x_{n_k}^*\rangle = \phi(x)$ for all $x \in X$. Thus $x^* := \phi$ has the required properties.


In contrast to the Hilbert space case considered in Proposition 3.16, the separability
assumption in Proposition 4.49 cannot be omitted:

Example 4.50. Consider the sequence $(\phi_n)_{n\ge 1}$ of coordinate functionals $x \mapsto x_n$ on
$X = \ell^\infty$. Given a subsequence $(\phi_{n_k})_{k\ge 1}$, let $x \in \ell^\infty$ be any element such that $x_{n_k} = (-1)^k$
for all $k \ge 1$. Then $\langle x, \phi_{n_k}\rangle = (-1)^k$ fails to converge as $k \to \infty$.


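Proposition 4.49 can be viewed as a sequential version of the following theorem.


Theorem 4.51 (Banach–Alaoglu). The closed unit ball of every dual Banach space is
compact with respect to the weak* topology.


Proof Let $\overline{B}_{\mathbb{K}} = \{a \in \mathbb{K} : |a| \le 1\}$ and $\overline{B}_{X^*} = \{x^* \in X^* : \|x^*\| \le 1\}$, where $X$ is a given
Banach space. In the remainder of the proof we think of $\overline{B}_{X^*}$ as endowed with the weak*
topology inherited from $X^*$.

By Tychonov’s theorem (Theorem C.13), the product space

\[ K := \prod_{x\in X} \|x\| \cdot \overline{B}_{\mathbb{K}} \]

is compact with respect to the product topology. Denoting elements of $K$ as $k = (k_x)_{x\in X}$,
consider the mapping $\phi : \overline{B}_{X^*} \to K$ given by

\[ \phi(x^*) := (\langle x, x^*\rangle)_{x\in X}. \tag{4.8} \]

Let $R$ denote its range. We first prove that $R$ is a closed subset of $K$. To this end let $r \in \overline{R}$.
As a first step we show that $r$ is linear in the sense that $a_0 r_{x_0} + a_1 r_{x_1} = r_{a_0 x_0 + a_1 x_1}$ for all
$a_0, a_1 \in \mathbb{K}$ and $x_0, x_1 \in X$. Fix an arbitrary $\varepsilon > 0$. By the definitions of the weak* and
product topologies, the set

\[ U := \{ k \in K : |k_{x_0} - r_{x_0}| < \varepsilon, \ |k_{x_1} - r_{x_1}| < \varepsilon, \ |k_{a_0 x_0 + a_1 x_1} - r_{a_0 x_0 + a_1 x_1}| < \varepsilon \} \]

is open in $K$ and contains $r$, and therefore $U$ intersects $R$. This means that there is an
$x_0^* \in \overline{B}_{X^*}$ such that $(\langle x, x_0^*\rangle)_{x\in X} \in U$, that is,

\[ |\langle x_0, x_0^*\rangle - r_{x_0}| < \varepsilon, \quad |\langle x_1, x_0^*\rangle - r_{x_1}| < \varepsilon, \quad |\langle a_0 x_0 + a_1 x_1, x_0^*\rangle - r_{a_0 x_0 + a_1 x_1}| < \varepsilon. \]

Then,

\[ \begin{aligned}
|a_0 r_{x_0} + a_1 r_{x_1} - r_{a_0 x_0 + a_1 x_1}|
&\le |a_0 r_{x_0} - a_0 \langle x_0, x_0^*\rangle| + |a_1 r_{x_1} - a_1 \langle x_1, x_0^*\rangle| \\
&\qquad + |a_0 \langle x_0, x_0^*\rangle + a_1 \langle x_1, x_0^*\rangle - r_{a_0 x_0 + a_1 x_1}| \\
&< |a_0| \varepsilon + |a_1| \varepsilon + \varepsilon.
\end{aligned} \]

Since $\varepsilon > 0$ was arbitrary, this proves the linearity of $r$.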
Next we prove that $r = (\langle x, x^*\rangle)_{x\in X}$ for some $x^* \in X^*$ which is the same as saying
that $r \in R$. Since we already know that $r$ is linear, we must prove that $r$ is bounded in
the sense that $|r_x| \le C\|x\|$ for all $x \in X$. To this end let $x_0 \in X$ and $\varepsilon > 0$. Arguing as
before, from $|\langle x_0, x_0^*\rangle - r_{x_0}| < \varepsilon$ we infer

\[ |r_{x_0}| \le |\langle x_0, x_0^*\rangle| + \varepsilon \le \|x_0\| + \varepsilon. \]

Since $x_0 \in X$ and $\varepsilon > 0$ were arbitrary, it follows that $r$ is bounded of norm at most 1
and thus defines an element $x^* \in \overline{B}_{X^*}$ in the sense that $r_x = \langle x, x^*\rangle$ for all $x \in X$. This
completes the proof that $R$ is closed. As a consequence, $R$ is compact, it being a closed
subset of the compact set $K$.

Since the mapping $\phi$ defined by (4.8) is injective, the inverse mapping $\phi^{-1} : R \to \overline{B}_{X^*}$
is well defined. We claim that this mapping is continuous. Since the sets $U_{x_0^*, x_0, \varepsilon} \cap \overline{B}_{X^*}$
generate the weak* topology of $\overline{B}_{X^*}$ it suffices to check that their images under $\phi$ are
open in $R$. But these are the sets $\{k \in K : |k_{x_0} - \langle x_0, x_0^*\rangle| < \varepsilon\} \cap R$, which are open in $R$.

The weak* compactness of $\overline{B}_{X^*} = \phi^{-1}(R)$ now follows from the compactness of $R$.


A topological space $(T, \tau)$ is said to be metrisable if the underlying set admits a metric
whose open sets are precisely the sets of $\tau$. The next result shows that if $X$ is separable,
then the weak* topology of the unit ball of $X^*$ is metrisable. As a result, Proposition
4.49 can alternatively be deduced from Theorem 4.51 by using that compactness and
sequential compactness are equivalent for metric spaces.


Proposition 4.52. If X is a separable Banach space, then the weak* topology of the 
closed unit ball of X* is metrisable. 


Proof Let $(x_n)_{n\ge 1}$ be a dense sequence in the closed unit ball $\overline{B}_X$ of $X$. Such a sequence
exists since $X$ is separable. It is easily checked that the formula

\[ d(x^*, y^*) := \sum_{n\ge 1} \frac{1}{2^n} \, \frac{|\langle x_n, x^* - y^*\rangle|}{1 + |\langle x_n, x^* - y^*\rangle|} \]

defines a metric $d$ on $\overline{B}_{X^*}$ and that the identity mapping $I_{X^*} : (\overline{B}_{X^*}, \mathrm{weak}^*) \to (\overline{B}_{X^*}, d)$
is continuous. In particular $I_{X^*}$ maps compact subsets of $(\overline{B}_{X^*}, \mathrm{weak}^*)$ to compact subsets
of $(\overline{B}_{X^*}, d)$. The Banach–Alaoglu theorem asserts that $(\overline{B}_{X^*}, \mathrm{weak}^*)$ is compact.
Since closed subsets of a compact space are compact and compact sets are closed, $I_{X^*}$
maps closed subsets of $(\overline{B}_{X^*}, \mathrm{weak}^*)$ to closed subsets of $(\overline{B}_{X^*}, d)$. Thus the continuous
mapping $I_{X^*}$ has continuous inverse and the result follows.
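
The verification that $d$ is a metric is left to the reader in the proof above; one possible way to check the triangle inequality is sketched here. The sketch assumes that $d$ has the weighted form $d(x^*,y^*) = \sum_{n\ge 1} 2^{-n} \rho(|\langle x_n, x^*-y^*\rangle|)$ with $\rho(t) = t/(1+t)$; the name $\rho$ is introduced only for this illustration.

```latex
% \rho(t) := t/(1+t) for t \ge 0 (name introduced for this sketch only).
\begin{align*}
\rho(s) &\le \rho(t) && \text{for } 0 \le s \le t
   \quad \bigl(\text{since } \rho(t) = 1 - \tfrac{1}{1+t}\bigr), \\
\rho(s+t) &= \frac{s}{1+s+t} + \frac{t}{1+s+t}
   \le \frac{s}{1+s} + \frac{t}{1+t} = \rho(s) + \rho(t) && \text{for } s,t \ge 0, \\
\rho\bigl(|\langle x_n, x^*-z^*\rangle|\bigr)
  &\le \rho\bigl(|\langle x_n, x^*-y^*\rangle| + |\langle x_n, y^*-z^*\rangle|\bigr) \\
  &\le \rho\bigl(|\langle x_n, x^*-y^*\rangle|\bigr) + \rho\bigl(|\langle x_n, y^*-z^*\rangle|\bigr).
\end{align*}
```

Multiplying by $2^{-n}$ and summing over $n$ gives $d(x^*,z^*) \le d(x^*,y^*) + d(y^*,z^*)$. Moreover, $d(x^*,y^*) = 0$ forces $\langle x_n, x^*-y^*\rangle = 0$ for all $n$, and hence $x^* = y^*$ because the sequence $(x_n)_{n\ge 1}$ is dense in the closed unit ball of $X$.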


4.7.b Reflexivity 


Recall that if $X$ is a Banach space, we can use the natural isometry $J : X \to X^{**}$ given
by $\langle x^*, Jx\rangle := \langle x, x^*\rangle$ to identify $X$ with a closed subspace of $X^{**}$.


Definition 4.53 (Reflexivity). A Banach space $X$ is called reflexive if the mapping
$J : X \to X^{**}$ is surjective.


The Banach–Alaoglu theorem implies the following characterisation of reflexivity.

Theorem 4.54 (Reflexivity and weak compactness of the unit ball). A Banach space is
reflexive if and only if its closed unit ball is weakly compact.


Proof The ‘only if’ part follows from the Banach–Alaoglu theorem, noting that the
canonical embedding $J : X \to X^{**}$ maps $\overline{B}_X$ onto $\overline{B}_{X^{**}}$ and that, under the identification
of $X$ and $X^{**}$, the weak topology of $X$ equals the weak* topology of $X^{**}$.

The ‘if’ part follows from Goldstine’s theorem (Theorem 4.46). Indeed, if $\overline{B}_X$ is
weakly compact, then its image under the canonical embedding $J$ is weak* compact in
$X^{**}$: this follows from the observation that $J$ is continuous as a mapping from $(X, \mathrm{weak})$
to $(X^{**}, \mathrm{weak}^*)$, which in turn is a trivial consequence of the definitions of the weak and
weak* topologies. It follows that $\overline{B}_X$ is weak* compact as a subset of the closed unit ball
$\overline{B}_{X^{**}}$. On the other hand, Goldstine’s theorem says that $\overline{B}_X$ is weak* dense as a subset of
$\overline{B}_{X^{**}}$. Hence we must have $\overline{B}_X = \overline{B}_{X^{**}}$, and this implies $X = X^{**}$.


Corollary 4.55. Let X be a Banach space. The following assertions hold: 


(1) if $X$ is reflexive, then every closed subspace of $X$ is reflexive;

(2) $X$ is reflexive if and only if $X^*$ is reflexive;

(3) if $X$ is isomorphic to a Banach space $Y$, then $X$ is reflexive if and only if $Y$ is
reflexive.


Proof (1): If $Y$ is a closed subspace of $X$, the closed unit ball $\overline{B}_Y$ is the intersection of
the set $\overline{B}_X$, which is weakly compact by Theorem 4.54, and the set $Y$, which is weakly
closed by Corollary 4.12. As a result, $\overline{B}_Y$ is weakly compact, and $Y$ is reflexive by
another application of Theorem 4.54.


(2): If $X$ is reflexive, the weak* and weak topologies of $X^*$ coincide. As a result, $\overline{B}_{X^*}$
is weakly compact by the Banach–Alaoglu theorem, and therefore $X^*$ is reflexive by
Theorem 4.54. If $X^*$ is reflexive, then $X^{**}$ is reflexive by what we just proved, and then
$X$, viewed as a closed subspace of $X^{**}$, is reflexive by part (1).

(3): If $i : X \to Y$ is an isomorphism, then $i^{**} : X^{**} \to Y^{**}$ is an isomorphism by
Proposition 4.27 applied twice. Denoting the natural isometries of $X$ and $Y$ into their
second duals by $J_X$ and $J_Y$, one easily checks that $i^{**} \circ J_X = J_Y \circ i$. This identity implies
that $J_X$ is surjective if and only if $J_Y$ is surjective.


Part (3) can also be deduced from Theorem 4.54; we leave this as an easy exercise.


Corollary 4.56. Every bounded sequence in a reflexive Banach space has a weakly 
convergent subsequence. 


Proof Let $(x_n)_{n\ge 1}$ be a bounded sequence in the reflexive Banach space $X$ and let $Y$ be
its closed span. Then $Y$ is separable, and $Y$ is reflexive by Corollary 4.55. Proceeding as
in the proof of Proposition 4.52, the weakly compact set $\overline{B}_Y$ is metrisable and we may
argue by sequential compactness of the resulting metric space.

Example 4.57. The following are examples of reflexive Banach spaces:


• finite-dimensional Banach spaces;
• the spaces $\ell^p$ with $1 < p < \infty$;
• the spaces $L^p(\Omega)$ with $1 < p < \infty$;
• Hilbert spaces.


To prove that finite-dimensional Banach spaces $X$ are reflexive we use the fact that
such spaces are isomorphic to $\mathbb{K}^d$, where $d = \dim(X)$. The isomorphism $\mathbb{K}^d \simeq (\mathbb{K}^d)^*$
dualises to an isomorphism $(\mathbb{K}^d)^* \simeq (\mathbb{K}^d)^{**}$, and one easily checks that the composition
of these isomorphisms equals the canonical embedding $J : \mathbb{K}^d \to (\mathbb{K}^d)^{**}$. In particular,
this embedding is surjective. It follows that $\mathbb{K}^d$ is reflexive, and therefore $X$ is reflexive
by Corollary 4.55.

In the same way the second and third examples follow from the isometric identifications
$(\ell^p)^* = \ell^q$ and $(L^p(\Omega))^* = L^q(\Omega)$, $\frac1p + \frac1q = 1$. Strictly speaking we have shown the
identification $(L^p(\Omega))^* = L^q(\Omega)$ only for $\sigma$-finite measure spaces, but for $1 < p < \infty$
the $\sigma$-finiteness assumption is redundant (see Problem 4.3).

That Hilbert spaces are reflexive is a consequence of the Riesz representation theorem
(Theorem 3.15) which sets up a conjugate-linear identification of the dual $H^*$
of a Hilbert space with the Hilbert space $H$ itself. Applying the theorem twice and
composing the identifications of $H$ with $H^*$ and of $H^*$ with $H^{**}$, we again find that the
resulting identification of $H$ with $H^{**}$ equals the natural embedding $J : H \to H^{**}$.


Example 4.58. The spaces $c_0$, $\ell^1$ and $\ell^\infty$ are nonreflexive: for $c_0$ this follows from the
fact that $c_0^{**} = \ell^\infty$ and for $\ell^1 = c_0^*$ and $\ell^\infty = (\ell^1)^*$ this follows from Corollary 4.55.
The spaces $C(K)$ and $L^1(\Omega)$ are nonreflexive except when they are finite-dimensional.
Indeed, it is easy to find closed subspaces isomorphic to $c_0$ or $\ell^1$, respectively (take the
closed linear span of any sequence of norm one vectors with disjoint supports), and
nonreflexivity again follows from Corollary 4.55.


4.7.c Translation Invariant Operators on $L^1(\mathbb{R}^d)$


In this section and the next we give two nontrivial applications of the Banach–Alaoglu
theorem or, more precisely, its sequential version contained in Proposition 4.49. The
first application is concerned with characterising the translation invariant operators on
$L^1(\mathbb{R}^d)$ as the convolutions with finite Borel measures. It is complemented by Theorem 5.35
in the next chapter, which characterises the translation invariant operators on
$L^2(\mathbb{R}^d)$ as the Fourier multiplier operators.


Lemma 4.59. If $g \in L^1(\mathbb{R}^d)$ satisfies $\int_{\mathbb{R}^d} g(x)\phi(x)\,dx = 0$ for all $\phi \in C_c(\mathbb{R}^d)$, then $g = 0$
almost everywhere on $\mathbb{R}^d$.

Proof Without loss of generality we may assume that $g$ is nonnegative.
For any open set $U \subseteq \mathbb{R}^d$ we may choose functions $\phi_n \in C_c(U)$ such that $0 \le \phi_n \uparrow 1_U$
pointwise as $n \to \infty$. Then, by the assumption of the lemma and dominated convergence,

\[ \int_U g(x)\,dx = \lim_{n\to\infty} \int_{\mathbb{R}^d} g(x)\phi_n(x)\,dx = 0. \]

Dynkin’s lemma (Lemma E.4), applied to the finite Borel measure $\nu(B) := \int_B g(x)\,dx$,
shows that $\int_B g(x)\,dx = 0$ for all Borel subsets $B \subseteq \mathbb{R}^d$. Taking $B = \{x \in \mathbb{R}^d : g(x) \ge \varepsilon\}$
gives $0 \le g < \varepsilon$ almost everywhere. This being true for all $\varepsilon > 0$ gives the result.


Theorem 4.60 (Translation invariant operators on $L^1(\mathbb{R}^d)$). If $T$ is a bounded operator
on $L^1(\mathbb{R}^d)$ commuting with every translation, then $T$ is the convolution with respect to
a (necessarily unique) measure $\mu \in M(\mathbb{R}^d)$, that is, for all $f \in L^1(\mathbb{R}^d)$ we have

\[ Tf(x) = \int_{\mathbb{R}^d} f(x-y)\,d\mu(y) \]

for almost all $x \in \mathbb{R}^d$. Moreover, $\|T\| \le \|\mu\|$.


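Proof Fix a function $\eta \in L^1(\mathbb{R}^d)$ satisfying $\int_{\mathbb{R}^d} \eta(x)\,dx = 1$. For $\varepsilon > 0$ the mollified
functions $\eta_\varepsilon(x) := \varepsilon^{-d}\eta(\varepsilon^{-1}x)$ belong to $L^1(\mathbb{R}^d)$ and satisfy $\|\eta_\varepsilon\|_1 = \|\eta\|_1$. By viewing
the functions $T\eta_\varepsilon \in L^1(\mathbb{R}^d)$ as densities of finite Borel measures on $\mathbb{R}^d$ we may
identify them with finite Borel measures $\mu_\varepsilon \in M(\mathbb{R}^d)$. By Theorem 4.2 and Proposition
E.16, $M(\mathbb{R}^d)$ can be identified with the dual of $C_0(\mathbb{R}^d)$. Hence by the sequential version
of the Banach–Alaoglu theorem (Proposition 4.49), some subsequence $(T\eta_{\varepsilon_n})_{n\ge 1}$
converges weak* to a measure $\mu \in M(\mathbb{R}^d)$. Then, for all $g \in C_0(\mathbb{R}^d)$,

\[ \lim_{n\to\infty} \int_{\mathbb{R}^d} g(y)\, T\eta_{\varepsilon_n}(y)\,dy = \int_{\mathbb{R}^d} g(y)\,d\mu(y). \]

Applying this to the functions $y \mapsto g(x+y) = (\tau_x g)(y)$, where $\tau_x$ is translation over $x$,
upon letting $n \to \infty$ we obtain

\[ \int_{\mathbb{R}^d} g(y)\, T\eta_{\varepsilon_n}(y-x)\,dy = \int_{\mathbb{R}^d} g(x+y)\, T\eta_{\varepsilon_n}(y)\,dy \to \int_{\mathbb{R}^d} g(x+y)\,d\mu(y). \]

By Fubini’s theorem, a change of variables, the commutation assumption, and Proposition
2.34, this implies

\[ \begin{aligned}
\int_{\mathbb{R}^d} g(x) \int_{\mathbb{R}^d} f(x-y)\,d\mu(y)\,dx
&= \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(x-y) g(x)\,dx\,d\mu(y) \\
&= \int_{\mathbb{R}^d} f(x) \int_{\mathbb{R}^d} g(x+y)\,d\mu(y)\,dx \\
&= \lim_{n\to\infty} \int_{\mathbb{R}^d} f(x) \int_{\mathbb{R}^d} g(y)\, T\eta_{\varepsilon_n}(y-x)\,dy\,dx \\
&= \lim_{n\to\infty} \int_{\mathbb{R}^d} f(x) \int_{\mathbb{R}^d} g(y)\, \tau_{-x} T\eta_{\varepsilon_n}(y)\,dy\,dx \\
&= \lim_{n\to\infty} \int_{\mathbb{R}^d} f(x) \int_{\mathbb{R}^d} g(y)\, T\tau_{-x}\eta_{\varepsilon_n}(y)\,dy\,dx \\
&\overset{(*)}{=} \lim_{n\to\infty} \int_{\mathbb{R}^d} T^*g(y) \int_{\mathbb{R}^d} f(x)\,\eta_{\varepsilon_n}(y-x)\,dx\,dy \\
&= \int_{\mathbb{R}^d} T^*g(y) f(y)\,dy = \int_{\mathbb{R}^d} g(y)\, Tf(y)\,dy.
\end{aligned} \]

In $(*)$ we identified $g \in C_0(\mathbb{R}^d)$ with a function in $L^\infty(\mathbb{R}^d) = (L^1(\mathbb{R}^d))^*$.

Since the above identities hold for all $g \in C_0(\mathbb{R}^d)$, Lemma 4.59 implies that
$\int_{\mathbb{R}^d} f(x-y)\,d\mu(y) = Tf(x)$ for almost all $x \in \mathbb{R}^d$. This proves that $T$ is of the asserted form, and
the bound $\|T\| \le \|\mu\|$ follows from the observation preceding the theorem.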

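It remains to establish uniqueness of the measure $\mu$. Suppose that $\mu \in M(\mathbb{R}^d)$ satisfies
$f * \mu = 0$ for all $f \in L^1(\mathbb{R}^d)$. Fix a function $f \in C_c(\mathbb{R}^d)$. Then

\[ (f * \mu)(x) = \int_{\mathbb{R}^d} \tau_x f(-y)\,d\mu(y) = 0 \]

for almost all $x \in \mathbb{R}^d$. Since $x \mapsto \tau_x f$ is continuous from $\mathbb{R}^d$ to $C_0(\mathbb{R}^d)$ it follows that the
preceding equality holds for all $x \in \mathbb{R}^d$. Since the choice of $f \in C_c(\mathbb{R}^d)$ was arbitrary,
we conclude that

\[ \int_{\mathbb{R}^d} \phi(y)\,d\mu(y) = 0 \]

for all $\phi \in C_c(\mathbb{R}^d)$. Since $C_c(\mathbb{R}^d)$ is dense in $C_0(\mathbb{R}^d)$, this implies $\mu = 0$.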
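
Two familiar special cases may help to make Theorem 4.60 concrete. They are sketched below under the convention $(f*\mu)(x) = \int_{\mathbb{R}^d} f(x-y)\,d\mu(y)$ of the theorem; the Dirac measure $\delta_h$ at a point $h \in \mathbb{R}^d$ and the density $\rho \in L^1(\mathbb{R}^d)$ are illustrative choices made here, not objects taken from the proof.

```latex
% \delta_h: Dirac measure at h \in \mathbb{R}^d; \rho: an arbitrary function in L^1(\mathbb{R}^d).
\begin{align*}
\mu = \delta_h: \quad & Tf(x) = \int_{\mathbb{R}^d} f(x-y)\,d\delta_h(y) = f(x-h),
   && \|T\| = 1 = \|\delta_h\|, \\
\mu = \rho\,dx: \quad & Tf(x) = \int_{\mathbb{R}^d} f(x-y)\rho(y)\,dy = (f*\rho)(x),
   && \|Tf\|_1 \le \|\rho\|_1 \|f\|_1 .
\end{align*}
```

In the first case $T$ is itself a translation, so translations are recovered as convolutions with point masses. In the second case $T$ is ordinary convolution with an integrable function, and the norm estimate follows from Fubini's theorem.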


4.7.d Prokhorov’s Theorem 


The aim of this section is to prove a compactness result of fundamental importance 
in Probability Theory, known as Prokhorov’s theorem. For its statement we need the 
following terminology. 


Definition 4.61 (Uniform tightness). A collection $\mathcal{P}$ of Borel measures on a topological
space $X$ is called uniformly tight if for every $\varepsilon > 0$ there exists a compact set $K$ in $X$
such that $\mu(X \setminus K) < \varepsilon$ for all $\mu \in \mathcal{P}$.


Definition 4.62 (Weak convergence). A sequence $(\mu_n)_{n\ge 1}$ of Borel probability measures
on a topological space $X$ is said to converge weakly to a Borel probability measure
$\mu$ on $X$ if

\[ \lim_{n\to\infty} \int_X f\,d\mu_n = \int_X f\,d\mu, \qquad f \in C_b(X), \]

where $C_b(X)$ denotes the Banach space of bounded continuous functions on $X$.


Viewing Borel probability measures on $X$ as functionals in the dual space $(C_b(X))^*$,
weak convergence in the sense of the above definition is precisely weak* convergence
in the sense discussed in the present chapter. The terminology ‘weak convergence’ is
firmly established in the Probability literature, however.
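
Both definitions can be tested on Dirac measures on $X = \mathbb{R}$. The following is a small worked example offered as an illustration only; $\delta_a$ denotes the Dirac measure at $a \in \mathbb{R}$, a notation assumed here.

```latex
% \delta_a: Dirac measure at a \in \mathbb{R}; K ranges over compact subsets of \mathbb{R}.
\begin{align*}
\int_{\mathbb{R}} f \, d\delta_{1/n} &= f(1/n) \to f(0) = \int_{\mathbb{R}} f\,d\delta_0
   && \text{for all } f \in C_b(\mathbb{R}), \text{ so } \delta_{1/n} \to \delta_0 \text{ weakly}; \\
\delta_{1/n}(\mathbb{R}\setminus[0,1]) &= 0 \quad (n\ge 1),
   && \text{so } \{\delta_{1/n} : n \ge 1\} \text{ is uniformly tight}; \\
\delta_n(\mathbb{R}\setminus K) &= 1 \quad \text{whenever } n > \sup_{x \in K}|x|,
   && \text{so } \{\delta_n : n \ge 1\} \text{ is not uniformly tight.}
\end{align*}
```

In the last case no subsequence converges weakly to a probability measure: if $\delta_{n_k} \to \mu$ weakly, then $\int f\,d\mu = \lim_{k\to\infty} f(n_k) = 0$ for every continuous $f$ with compact support, which forces $\mu(K) = 0$ for all compact $K$ and hence $\mu(\mathbb{R}) = 0$.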


Theorem 4.63 (Prokhorov’s theorem). For a metric space X, the following assertions 
hold: 


(1) if a family $\mathcal{P}$ of Borel probability measures on $X$ is uniformly tight, then every
sequence in $\mathcal{P}$ contains a weakly convergent subsequence;

(2) if $X$ is separable and complete, then every weakly convergent sequence of Borel
probability measures on $X$ is uniformly tight.


The proof of part (1) relies on the following lemma. 


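Lemma 4.64. Let $(\Omega, \mathcal{F})$ be a measurable space and let $A_1 \subseteq A_2 \subseteq \dots$ be an increasing
sequence of sets in $\mathcal{F}$ such that $\bigcup_{j\ge 1} A_j = \Omega$. For each $j \ge 1$ let $\mu_j$ be a measure on
$A_j$. If the measures $\mu_j$ are increasing in the sense that $\mu_j|_{A_i} \ge \mu_i$ whenever $j \ge i$, then
the formula

\[ \mu(B) := \lim_{j\to\infty} \mu_j(B \cap A_j) \]

defines a measure on $\Omega$.

Proof If $B \in \mathcal{F}$, then for $j \ge i$ we have $\mu_j(B \cap A_j) \ge \mu_j(B \cap A_i) = \mu_j|_{A_i}(B \cap A_i) \ge \mu_i(B \cap A_i)$.
Therefore the limit defining $\mu(B)$ exists, and $\mu(B) = \sup_{j\ge 1} \mu_j(B \cap A_j)$.
It is clear that $\mu(\emptyset) = 0$. To prove that $\mu$ is countably additive, let $B = \bigcup_{n\ge 1} B_n$ with
disjoint measurable sets $B_n$. On the one hand,

\[ \mu(B) = \sup_{j\ge 1} \mu_j(B \cap A_j) = \sup_{j\ge 1} \sum_{n\ge 1} \mu_j(B_n \cap A_j)
\le \sum_{n\ge 1} \sup_{j\ge 1} \mu_j(B_n \cap A_j) = \sum_{n\ge 1} \mu(B_n). \]

On the other hand, for each $j_0 \ge 1$ we have

\[ \mu(B) = \sup_{j\ge 1} \mu_j(B \cap A_j) \ge \mu_{j_0}(B \cap A_{j_0}) = \sum_{n\ge 1} \mu_{j_0}(B_n \cap A_{j_0}). \]

Hence, by monotone convergence,

\[ \mu(B) \ge \lim_{j_0\to\infty} \sum_{n\ge 1} \mu_{j_0}(B_n \cap A_{j_0}) = \sum_{n\ge 1} \lim_{j_0\to\infty} \mu_{j_0}(B_n \cap A_{j_0}) = \sum_{n\ge 1} \mu(B_n). \]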
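
A simple instance of the lemma, given here only as an illustration, is obtained by exhausting the real line by compact intervals. The choice of $\Omega$, $A_j$ and the use of Lebesgue measure $\lambda$ are assumptions made for this example.

```latex
% \lambda denotes Lebesgue measure on \mathbb{R} (illustrative choice).
\begin{align*}
\Omega &= \mathbb{R}, \qquad A_j = [-j, j], \qquad
   \mu_j(B) = \lambda(B) \quad (B \subseteq A_j \text{ Borel}), \\
\mu_j|_{A_i} &= \mu_i \quad (j \ge i), \\
\mu(B) &= \lim_{j\to\infty} \lambda(B \cap [-j,j]) = \lambda(B)
   \quad (B \subseteq \mathbb{R} \text{ Borel}).
\end{align*}
```

The last equality is the continuity from below of $\lambda$, so in this case the lemma simply reassembles Lebesgue measure on $\mathbb{R}$ from its restrictions to the intervals $[-j,j]$.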


Proof of Theorem 4.63, part (1) Let $(\mu_n)_{n\ge 1}$ be a sequence in $\mathcal{P}$. Assuming that $\mathcal{P}$ is
uniformly tight, we must prove that there are a subsequence $(\mu_{n_k})_{k\ge 1}$ and a Borel
probability measure $\mu$ on $X$ such that $\mu_{n_k} \to \mu$ weakly.

Choose an increasing sequence of compact sets $K_j \subseteq X$ such that $\mu_n(K_j) \ge 1 - 1/2^j$
for all $j \ge 1$ and $n \ge 1$. Replacing $X$ by $\bigcup_{j\ge 1} K_j$, we may assume that $X$ is separable.

Step 1 – Identifying each restriction $\mu_n|_{K_j}$ with an element of $(C(K_j))^*$, by a diagonal
argument we find a subsequence $(\mu_{n_k})_{k\ge 1}$ such that for all $j \ge 1$ the sequence
$(\mu_{n_k}|_{K_j})_{k\ge 1}$ is weak* convergent in $(C(K_j))^*$ to some Borel measure $\nu_j$ on $K_j$; this
argument uses the sequential version of the Banach–Alaoglu theorem (Proposition 4.49)
and the separability of the spaces $C(K_j)$ (Proposition 2.8). Note that

\[ \nu_j(K_j) = \langle 1_{K_j}, \nu_j\rangle = \lim_{k\to\infty} \langle 1_{K_j}, \mu_{n_k}|_{K_j}\rangle = \lim_{k\to\infty} \mu_{n_k}(K_j) \ge 1 - 2^{-j}, \]

where the brackets refer to the duality of $C(K_j)$ and its dual $M(K_j)$.

Step 2 – We claim that if $j \ge i$, then $\nu_j|_{K_i} \ge \nu_i$. To this end fix a number $\varepsilon > 0$
and a function $f \in C(K_i)$ satisfying $0 \le f(x) \le 1$ for all $x \in K_i$. Using Theorem C.12 we
extend $f$ to a function in $C(K_j)$ satisfying $0 \le f(x) \le 1$ for all $x \in K_j$, and let $f_m \in C(K_j)$
be defined by $f_m(x) := (1 - m \cdot d(x,K_i))^+ f(x)$, $x \in K_j$. Choosing $m$ large enough, say
$m \ge m_\varepsilon$, we may assume that

\[ \int_{K_j \setminus K_i} f_m\,d\nu_j < \varepsilon. \]

Since $f_m = f$ on $K_i$ we find

\[ \begin{aligned}
\int_{K_i} f\,d\nu_j - \int_{K_i} f\,d\nu_i
&\ge -\varepsilon + \int_{K_j} f_m\,d\nu_j - \int_{K_i} f_m\,d\nu_i \\
&= -\varepsilon + \lim_{k\to\infty} \underbrace{\Bigl( \int_{K_j} f_m\,d\mu_{n_k} - \int_{K_i} f_m\,d\mu_{n_k} \Bigr)}_{\ge 0} \ge -\varepsilon.
\end{aligned} \]

Since $\varepsilon > 0$ was arbitrary, this proves that $\int_{K_i} f\,d\nu_j \ge \int_{K_i} f\,d\nu_i$.

Step 3 – We apply Lemma 4.64 to see that

\[ \mu(B) := \lim_{j\to\infty} \nu_j(B \cap K_j) = \sup_{j\ge 1} \nu_j(B \cap K_j) \]

defines a Borel measure $\mu$ on $X_0 := \bigcup_{j\ge 1} K_j$. We may extend $\mu$ to a Borel measure on
all of $X$ by extending it identically 0 outside $X_0$. Clearly, $\mu(X) \le 1$ and

\[ \mu(X) \ge \mu(K_j) \ge \nu_j(K_j) \ge 1 - 2^{-j}. \]

This proves a couple of things at the same time, namely that $\mu$ is a probability measure
and that $\mu$ is tight.

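It remains to prove that $\lim_{k\to\infty} \mu_{n_k} = \mu$ weakly. For this, it suffices to prove that
$\lim_{k\to\infty} \int_X f\,d\mu_{n_k} = \int_X f\,d\mu$ for all $f \in C_b(X)$ satisfying $0 \le f \le 1$. Fixing such a function,
choose a sequence $(f_m)_{m\ge 1}$ of simple functions satisfying $0 \le f_m \le 1$ for all $m \ge 1$
and $f_m \to f$ uniformly as $m \to \infty$. Fix $j_0$ so large that $2^{-j_0} < \varepsilon$. Then, for $m \ge m_\varepsilon$ (with
$m_\varepsilon$ as in Step 2) and all $j \ge j_0$,

\[ \begin{aligned}
\limsup_{k\to\infty} \Bigl| \int_X f\,d\mu - \int_X f\,d\mu_{n_k} \Bigr|
&\le 2 \cdot 2^{-j} + \limsup_{k\to\infty} \Bigl| \int_{K_j} f\,d\mu - \int_{K_j} f\,d\mu_{n_k} \Bigr| \\
&\le 2\varepsilon + \Bigl| \int_{K_j} f\,d\mu - \int_{K_j} f\,d\nu_j \Bigr| \\
&\le 4\varepsilon + \Bigl| \int_{K_{j_0}} f\,d\mu - \int_{K_{j_0}} f\,d\nu_j \Bigr| \\
&\le 6\varepsilon + \Bigl| \int_{K_{j_0}} f_m\,d\mu - \int_{K_{j_0}} f_m\,d\nu_j \Bigr|.
\end{aligned} \]

Since $\lim_{j\to\infty} \nu_j(B \cap K_{j_0}) = \lim_{j\to\infty} \nu_j(B \cap K_{j_0} \cap K_j) = \mu(B \cap K_{j_0})$ for all Borel sets $B$
in $X$ and each function $f_m$ is simple, upon letting $j \to \infty$ we obtain

\[ \lim_{j\to\infty} \int_{K_{j_0}} f_m\,d\nu_j = \int_{K_{j_0}} f_m\,d\mu. \]

Consequently,

\[ \limsup_{k\to\infty} \Bigl| \int_X f\,d\mu - \int_X f\,d\mu_{n_k} \Bigr| \le 6\varepsilon. \]

Since $\varepsilon > 0$ was arbitrary this proves the weak convergence.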


For the proof of part (2) we need the following characterisation of weak convergence, 
known as the Portmanteau theorem. 


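Proposition 4.65. Let $\mu_n$, $n \ge 1$, and $\mu$ be Borel probability measures on a metric space
$X$. The following assertions are equivalent:


(1) $\lim_{n\to\infty} \mu_n = \mu$ weakly;
(2) for all open subsets $U$ of $X$ we have $\mu(U) \le \liminf_{n\to\infty} \mu_n(U)$;
(3) for all closed subsets $F$ of $X$ we have $\limsup_{n\to\infty} \mu_n(F) \le \mu(F)$.


Proof (1)$\Rightarrow$(2): Let $U \subseteq X$ be open. For each $k \ge 1$ let

\[ U^{(k)} := \Bigl\{ x \in U : d(x, \complement U) \ge \frac{1}{k} \Bigr\}. \]

Since $U$ is open we have $U = \bigcup_{k\ge 1} U^{(k)}$ and $\mu(U) = \lim_{k\to\infty} \mu(U^{(k)})$.
The functions

\[ f_k(x) := \min\{1, k\, d(x, \complement U)\}, \qquad x \in X, \]

belong to $C_b(X)$ and satisfy $0 \le f_k \le 1$, $f_k = 0$ on $\complement U$, and $f_k = 1$ on $U^{(k)}$.
For each $k \ge 1$,

\[ \mu(U^{(k)}) \le \int_X f_k\,d\mu = \lim_{n\to\infty} \int_X f_k\,d\mu_n = \liminf_{n\to\infty} \int_X f_k\,d\mu_n \le \liminf_{n\to\infty} \mu_n(U). \]

Now we pass to the limit $k \to \infty$.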


The equivalence (2)++(3) follows by taking complements. 
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It remains to prove that (2) and (3) together imply (1). Let U;,...,U, be open sets in 
X and consider a function of the form g = ya Cc jlu;- We have, using (2), 


k k 
[sau = Leon) < Ha ci) - limint [edn 


Similarly, for g = Yin Cj ly, we have, using (3), 


k k 
limsup | gduln < ye limsup [n(U;) < he L(U;) = | eau. 


noo JX j=l n—oo 


Let $f \in C_b(X)$ be a real-valued function and choose $a, b \in \mathbb{R}$ such that $a < f(x) < b$ for all $x \in X$. There are at most countably many $r \in (a,b)$ such that the set $\{x \in X : f(x) = r\}$ has nonzero $\mu$-measure. Let R denote the set of these numbers r. Fix $\varepsilon > 0$ and let $\pi = \{t_0, \dots, t_k\}$ be a partition of $[a,b]$ with $\mathrm{mesh}(\pi) < \varepsilon$ such that $t_0 = a$, $t_k = b$, and $t_j \notin R$ for all $j = 1, \dots, k-1$. Put $U_j := \{x \in X : f(x) \in (t_{j-1}, t_j)\}$, $j = 1, \dots, k$, and
\[ g := \sum_{j=1}^k t_{j-1} 1_{U_j}, \qquad h := \sum_{j=1}^k t_j 1_{\overline{U_j}}. \]
With the above notation, $g(x) \le f(x) \le \varepsilon + g(x)$ and $f(x) \le h(x) \le \varepsilon + f(x)$ whenever $f(x) \ne t_j$ for $j = 1, \dots, k-1$. Since $\mu\{x \in X : f(x) = t_j\} = 0$ for $j = 1, \dots, k-1$, we obtain
\[ \limsup_{n\to\infty} \int_X f \, d\mu_n \le \limsup_{n\to\infty} \int_X h \, d\mu_n \le \int_X h \, d\mu \le \varepsilon + \int_X f \, d\mu \]
\[ \le 2\varepsilon + \int_X g \, d\mu \le 2\varepsilon + \liminf_{n\to\infty} \int_X g \, d\mu_n \le 2\varepsilon + \liminf_{n\to\infty} \int_X f \, d\mu_n. \]
Since $\varepsilon > 0$ was arbitrary, this concludes the proof for real-valued functions f. In the case of complex scalars, the result for complex-valued f follows from it by considering real and imaginary parts separately.
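The inequalities in (2) and (3) can be strict. The following minimal sketch, which is an illustration and not part of the text, takes the Dirac measures $\mu_n = \delta_{1/n}$ on the real line, which converge weakly to $\mu = \delta_0$, and evaluates both sides for the open set U = (0, 2) and the closed set F = {0}. The helper `dirac` and the finite tail used as a stand-in for the limit inferior and superior are assumptions made for this example only.

```python
from fractions import Fraction


def dirac(point):
    """Dirac measure at `point`; a set is passed as its indicator predicate."""
    return lambda indicator: 1 if indicator(point) else 0


# mu_n = delta_{1/n} converges weakly to mu = delta_0.
mu = dirac(Fraction(0))
mus = [dirac(Fraction(1, n)) for n in range(1, 201)]


def in_open_U(x):
    """Indicator of the open set U = (0, 2)."""
    return 0 < x < 2


def in_closed_F(x):
    """Indicator of the closed set F = {0}."""
    return x == 0


# The values mu_n(U) and mu_n(F) are constant in n, so a tail suffices.
liminf_U = min(m(in_open_U) for m in mus[100:])
limsup_F = max(m(in_closed_F) for m in mus[100:])

print("mu(U) =", mu(in_open_U), "<= liminf mu_n(U) =", liminf_U)
print("limsup mu_n(F) =", limsup_F, "<= mu(F) =", mu(in_closed_F))
```

In this example the mass escapes from U in the limit and arrives in F, which is why only one-sided inequalities can be expected.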


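Proof of Theorem 4.63, part (2) Since X is separable, we may pick a dense sequence $(x_n)_{n\ge 1}$ in X. For every integer $k \ge 1$ the open balls $B(x_n; \frac{1}{k})$, $n \ge 1$, cover X. Fix $\varepsilon > 0$ and choose the integers $N_k \ge 1$ such that
\[ \mu\Bigl(\bigcup_{n=1}^{N_k} B(x_n; \tfrac{1}{k})\Bigr) > 1 - \frac{\varepsilon}{2^k}. \]
By Proposition 4.65, this implies that for all large enough j, say for $j \ge j_0$, we have
\[ \mu_j\Bigl(\bigcup_{n=1}^{N_k} B(x_n; \tfrac{1}{k})\Bigr) > 1 - \frac{\varepsilon}{2^k}. \]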
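The set
\[ K := \bigcap_{k\ge 1} \bigcup_{n=1}^{N_k} \overline{B}(x_n; \tfrac{1}{k}) \]
is closed and totally bounded. The completeness of X therefore implies that K is compact. Moreover, for $j \ge j_0$,
\[ \mu_j(\complement K) \le \sum_{k\ge 1} \mu_j\Bigl(\complement \bigcup_{n=1}^{N_k} \overline{B}(x_n; \tfrac{1}{k})\Bigr) \le \sum_{k\ge 1} \mu_j\Bigl(\complement \bigcup_{n=1}^{N_k} B(x_n; \tfrac{1}{k})\Bigr) < \sum_{k\ge 1} \frac{\varepsilon}{2^k} = \varepsilon. \]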


Problems 


4.1 A bounded operator is said to be of finite rank if its range is finite-dimensional. Show that every finite rank operator $T \in \mathscr{L}(X,Y)$ is of the form
\[ Tx = \sum_{n=1}^N \langle x, x_n^* \rangle y_n, \quad x \in X, \]
for certain $y_n \in Y$ and $x_n^* \in X^*$.

4.2 Consider an open set $D \subseteq \mathbb{C}$ and a complex Banach space X. A function $f : D \to X$ is said to be holomorphic if for all $z_0 \in D$ the limit
\[ \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \]
exists in X. Use the Hahn–Banach theorem to prove that the Cauchy theorem and the Cauchy integral formula hold for holomorphic functions $f : D \to X$ defined on an open set D in $\mathbb{C}$:
\[ \frac{1}{2\pi i} \int_{\{|z - z_0| = r\}} f(z) \, dz = 0 \]
and
\[ \frac{1}{2\pi i} \int_{\{|z - z_0| = r\}} \frac{f(z)}{z - z_0} \, dz = f(z_0). \]
Here it is assumed that $z_0 \in D$ and $r > 0$ is so small that $\{|z - z_0| = r\}$ is contained in D; this contour is oriented counterclockwise.


4.3 Let $1 \le p, q \le \infty$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$.

(a) Prove that the identification $(L^p(\Omega))^* = L^q(\Omega)$ remains true in the non-$\sigma$-finite case if $1 < p < \infty$.
Hint: Given a functional $\phi \in (L^p(\Omega))^*$ there is a sequence $(f_n)_{n\ge 1}$ in $L^p(\Omega)$ such that $\|\phi\| = \sup_{n\ge 1} |\langle f_n, \phi \rangle|$. The $\sigma$-algebra generated by this sequence is $\sigma$-finite.

(b) Show, by way of example, that part (a) does not extend to p = 1.

4.4 Prove that if $X^*$ is separable, then X is separable.
Hint: There is a sequence $(x_n)_{n\ge 1}$ in X such that $\|x^*\| = \sup_{n\ge 1} |\langle x_n, x^* \rangle|$ for a countable dense set of functionals $x^*$ in $X^*$.

4.5 Let X be a Banach space.

(a) Show that for all nonzero $x^* \in X^*$ we have an isomorphism of Banach spaces
\[ X/N(x^*) \simeq \mathbb{K}, \]
where $N(x^*) := \{x \in X : \langle x, x^* \rangle = 0\}$.

(b) Let $Q : X \to X/N(x^*)$ denote the quotient mapping. Prove that if $\|x^*\| = 1$, then for all $x \in X$ the following identity holds:
\[ \|Qx\| = |\langle x, x^* \rangle|. \]
Hint: For the inequality ‘≤’, begin by showing that for any $0 < \varepsilon < 1$ there must exist $c \in \mathbb{K}$ and $y \in X$ such that $\|cx + y\| = 1$ and $|\langle cx + y, x^* \rangle| > 1 - \varepsilon$.

4.6 Let Y be a proper closed subspace of X and let $x_0 \in X \setminus Y$. As in Corollary 4.12, on the span $X_0$ of Y and $x_0$ define $\phi(y) := 0$ for all $y \in Y$ and $\phi(x_0) := 1$. It was shown in the corollary that $\phi$ is bounded. Show that its norm is given by
\[ \|\phi\|_{X_0^*} = \frac{1}{d(x_0, Y)}. \]

4.7 Let X be a Banach space. For a set $A \subseteq X$ and an element $x \in X$ we denote by $d(x,A) = \inf_{y \in A} \|x - y\|$ the distance from x to A.

(a) Let X be a Banach space, let $X_0 \subseteq X$ be a proper closed subspace, and let $x \in X \setminus X_0$. Prove that there exists an $x^* \in X^*$ with $\|x^*\| = 1$ such that $\langle x, x^* \rangle = d(x, X_0)$ and $x^*|_{X_0} = 0$.
Hint: Let $Y = \mathrm{span}(X_0, x)$. Prove that the mapping $x_0^* : Y \to \mathbb{K}$ given by
\[ x_0^*(x_0 + tx) := t \, d(x, X_0), \quad x_0 \in X_0, \ t \in \mathbb{K}, \]
is linear, belongs to $Y^*$, has norm $\|x_0^*\|_{Y^*} = 1$, and satisfies $x_0^*|_{X_0} = 0$. Apply the Hahn–Banach theorem to extend $x_0^*$ to a functional on X.
(b) Using the result of part (a), show that there exists $x^* \in (L^\infty(0,1))^*$ such that
\[ \langle f, x^* \rangle = \int_0^1 f(t) \, dt, \quad f \in C[0,1], \]
but
\[ \langle 1_{(0,\frac{1}{2})}, x^* \rangle \ne \int_0^1 1_{(0,\frac{1}{2})}(t) \, dt. \]


4.8 We take a look at Banach spaces containing, and contained in, $\ell^\infty$.

(a) Let X be a Banach space, Y be a closed subspace of X, and let $T_0 : Y \to \ell^\infty$ be a bounded operator. Show that there exists a bounded operator $T : X \to \ell^\infty$ such that $T|_Y = T_0$ and $\|T\| = \|T_0\|$.
Hint: Apply the Hahn–Banach theorem ‘coordinatewise’.

(b) Using the result of part (a), prove that if a closed subspace Y of a Banach space X is isomorphic to $\ell^\infty$, then Y is complemented in X.

(c) Show that every separable Banach space X is isometrically isomorphic to a closed subspace of $\ell^\infty$. More precisely, show that if X is a separable Banach space, then there exists a closed subspace Y of $\ell^\infty$ and an isometric isomorphism T from X onto Y.
Hint: Use the Hahn–Banach theorem in combination with the separability to make a clever choice of a sequence of functionals $(x_n^*)_{n\ge 1}$ in $X^*$, and consider the mapping $T : x \mapsto (\langle x, x_n^* \rangle)_{n\ge 1}$.

4.9 Find an example of a two-dimensional Banach space X and a functional on one of its closed one-dimensional subspaces which has infinitely many extensions to a functional on X of the same norm.

4.10 Let Y be a closed subspace of a Banach space X.

(a) Prove that for all $x^* \in X^*$ we have
\[ d(x^*, Y^\perp) = \|x^*|_Y\|_{Y^*}. \]

(b) Prove that a functional $y^* \in Y^*$ has a unique extension to a functional in $X^*$ of the same norm if and only if the annihilator $Y^\perp$ has the Haar property as a closed subspace of $X^*$.

Recall from Problem 3.4 that a Banach space X is called strictly convex if for all norm one vectors $x_0, x_1 \in X$ with $x_0 \ne x_1$ and $0 < \lambda < 1$ we have $\|(1 - \lambda)x_0 + \lambda x_1\| < 1$. This problem shows that if the dual $X^*$ of a Banach space is strictly convex, then every functional on a closed subspace of X has a unique Hahn–Banach extension of the same norm.

The closed subspace Y is said to have the Haar property if for all $x \in X \setminus Y$ there exists a unique $y \in Y$ such that $d(x,Y) = \|x - y\|$.
(c) Prove that if X is strictly convex, then every closed subspace Y of X has the Haar property.

4.11 Let Y be a closed subspace of a Banach space X and let $i : Y \to X$ be the inclusion mapping. Show that the adjoint $i^* : X^* \to Y^*$ is the restriction mapping.

4.12 Let $H_1$ and $H_2$ be Hilbert spaces and X be a Banach space. Suppose $T_1 \in \mathscr{L}(H_1, X)$ and $T_2 \in \mathscr{L}(H_2, X)$ are bounded operators. Prove that the following assertions are equivalent:

(1) there is a constant $C \ge 0$ such that $\|T_1^* x^*\| \le C \|T_2^* x^*\|$ for all $x^* \in X^*$;
(2) $T_1(H_1) \subseteq T_2(H_2)$.

Hint: Begin by showing that without loss of generality we may assume that $T_1$ and $T_2$ are injective.

4.13 Let $1 \le p < \infty$.

(a) Let $T : L^p(0,1) \to \mathbb{K}$ be the linear mapping defined by
\[ Tf := \int_0^1 f(s) \, ds, \quad f \in L^p(0,1). \]
Show that T is bounded and find an expression for $T^*$.

(b) Let $T : L^p(0,1) \to L^p(0,1)$ be the linear operator defined by
\[ (Tf)(t) := \int_0^t f(s) \, ds, \quad t \in (0,1), \ f \in L^p(0,1). \]
Show that T is bounded and find an expression for $T^*$.

(c) Let T be the linear operator of part (b), now viewed as an operator from $L^p(0,1)$ into $C[0,1]$. Show that T is bounded and find an expression for $T^*$.

4.14 This problem presents a way to evaluate the norm of a $d \times d$ matrix A with complex coefficients, viewed as a bounded operator on $\mathbb{C}^d$.

(a) Prove that all eigenvalues of $A^*A$ are nonnegative real numbers and that the eigenspaces corresponding to different eigenvalues are orthogonal.

(b) Conclude from Proposition 4.28 and part (a) that
\[ \|A\|^2 = \max\{\lambda \ge 0 : \lambda \text{ is an eigenvalue of } A^*A\}. \]

4.15 Let $H_0$ be a closed subspace of H and let $i : H_0 \to H$ be the inclusion mapping. Show that the adjoint $i^* : H \to H_0$ is the orthogonal projection in H onto $H_0$, viewed as a mapping from H to $H_0$.

4.16 Let T be a bounded operator on H.

(a) Show that if T is a contraction (that is, $\|T\| \le 1$), then for each $x \in H$ we have $Tx = x$ if and only if $T^*x = x$. Conclude that H admits an orthogonal direct sum decomposition
\[ H = N(I - T) \oplus \overline{R(I - T)}. \]
Hint: If $Tx = x$, show that $T^*x - x \perp x$ and deduce that $T^*x = x$.

(b) Let T be a contraction and define
\[ S_n := \frac{1}{n} \sum_{k=0}^{n-1} T^k. \]
Using Proposition 4.31 and the result of part (a), prove that
\[ \lim_{n\to\infty} S_n x = \begin{cases} x & \text{if } x \in N(I - T), \\ 0 & \text{if } x \perp N(I - T). \end{cases} \]

4.17 Let K be a compact Hausdorff space and let $\phi \in (C(K))^*$ be an element with the following two properties:

(i) $\phi(1) = 1$;
(ii) $\phi(fg) = \phi(f)\phi(g)$ for all $f, g \in C(K)$.

Prove that $\phi(f) = f(x)$ for some $x \in K$.
Hint: Show that $\phi$, as an element of $M(K)$, is supported on a singleton.

4.18 Find the extreme points of the closed convex set
\[ C = \{f \in L^1(0,1) : f \ge 0, \ \|f\|_1 \le 1\}. \]

4.19 Let C be a closed convex subset of a separable Banach space X. Prove that there exists a sequence $(x_n^*)_{n\ge 1}$ of norm one elements in $X^*$ and a sequence $(F_n)_{n\ge 1}$ of closed sets in $\mathbb{K}$ such that
\[ C = \bigcap_{n\ge 1} \{x \in X : \langle x, x_n^* \rangle \in F_n\}. \]
Hint: Separate C from the elements of a dense sequence in its complement $\complement C$ using the Hahn–Banach separation theorem.

4.20 Let X and Y be Banach spaces. Prove the following assertions:

(a) a linear operator $T : X \to Y$ is continuous with respect to the weak topologies of X and Y if and only if it is bounded;

(b) a linear operator $S : Y^* \to X^*$ is continuous with respect to the weak* topologies of $Y^*$ and $X^*$ if and only if it is the adjoint of a bounded operator $T : X \to Y$.

4.21 Prove that the weak topology of a Banach space X coincides with the norm topology if and only if X is finite-dimensional.

4.22 Prove that $c_0$, $C[0,1]$, $C_b(D)$, and $L^1(\Omega)$ are norm closed and weak* dense in $\ell^\infty$, $L^\infty(0,1)$, $L^\infty(D)$, and $M(\Omega)$, respectively.

4.23 Prove that the linear span of the Dirac measures is weak* dense in $M(\mathbb{R})$.

4.24 Find an example of a sequence $(x_n^*)_{n\ge 1}$ in a dual Banach space $X^*$ with the following two properties:
(i) there exists an $x^* \in X^*$ such that $\lim_{n\to\infty} \langle x, x_n^* \rangle = \langle x, x^* \rangle$ for all $x \in X$;
(ii) no sequence contained in the convex hull of $(x_n^*)_{n\ge 1}$ converges to $x^*$ with respect to the norm of $X^*$.

Compare with Corollary 4.33.

4.25 Prove the following converse to Proposition 4.52: If the weak* topology of the closed unit ball of a dual Banach space $X^*$ is metrisable, then X is separable.
Hint: Complete the details of the following argument. Let d be a metric which induces the weak* topology of $\overline{B}_{X^*}$. Then $(\overline{B}_{X^*}, d)$ is a compact metric space and therefore the Banach space $C(\overline{B}_{X^*}, d)$ is separable by Proposition 2.8. Now observe that X is isometrically contained in $C(\overline{B}_{X^*}, d)$ in a natural way.

4.26 Let X be a Banach space.

(a) Let $x_1, \dots, x_N \in X$ and $c_1, \dots, c_N \in \mathbb{K}$ and a constant $M \ge 0$ be given. Prove that the following are equivalent:

(1) there exists an $x^* \in X^*$ such that $\|x^*\| \le M$ and
\[ \langle x_n, x^* \rangle = c_n, \quad n = 1, \dots, N; \]
(2) for all $\lambda_1, \dots, \lambda_N \in \mathbb{K}$ we have
\[ \Bigl| \sum_{n=1}^N \lambda_n c_n \Bigr| \le M \Bigl\| \sum_{n=1}^N \lambda_n x_n \Bigr\|. \]

(b) Use the Banach–Alaoglu theorem to extend the result of part (a) to infinite sequences $(x_n)_{n\ge 1}$ and $(c_n)_{n\ge 1}$.

4.27 Show that the weak topology of a weakly compact subset of a separable Banach space is metrisable.

4.28 Using the result of the preceding problem, show that if K is a weakly compact subset of a Banach space, then every sequence $(x_n)_{n\ge 1}$ contained in K has a weakly convergent subsequence.

4.29 Using the result of the preceding problem, show that $C[0,1]$ and $L^1(0,1)$ are nonreflexive by checking that their closed unit balls contain sequences without weakly convergent subsequences.

4.30 As an application of the Banach–Alaoglu theorem, prove that there exist functionals $\phi \in (\ell^\infty)^*$ such that for all $x = (x_n)_{n\ge 1} \in \ell^\infty$ we have:

(i) $\langle x, \phi \rangle \ge 0$ whenever $x \ge 0$;
(ii) $\langle x, \phi \rangle = \langle Sx, \phi \rangle$, where $S : (x_n)_{n\ge 1} \mapsto (x_{n+1})_{n\ge 1}$ is the left shift;
(iii) $\langle x, \phi \rangle = \lim_{n\to\infty} x_n$ whenever $\lim_{n\to\infty} x_n$ exists.

Functionals with these properties are called Banach limits.
Hint: Consider the functionals $\phi_n(x) := \frac{1}{n} \sum_{j=1}^n x_j$.
4.31 Let $(x_n)_{n\ge 1}$ be the sequence in $\ell^\infty$ defined by $x_n = (0, \dots, 0, 1, 1, \dots)$, $n \ge 1$, where the zero is repeated n times.

(a) Show that this sequence has no weakly convergent subsequence.
Hint: Use the result of Problem 4.30.

(b) Why doesn’t this contradict the Banach–Alaoglu theorem?


4.32 Let X be a separable Banach space and let $X_0$ be a closed subspace of X isomorphic to $c_0$. Our aim is to show that $X_0$ is complemented in X.

(a) Use the Hahn–Banach theorem and the definition of an isomorphism to show that there exists a bounded sequence $(x_n^*)_{n\ge 1}$ in $X^*$ such that for all $y \in c_0$ and $n \ge 1$ we have $\langle jy, x_n^* \rangle = y_n$, where $j : c_0 \to X_0$ is the isomorphism mapping $c_0$ onto $X_0$.
Hint: Consider the adjoint of the operator $j^{-1} : X_0 \to c_0$.

(b) Suppose that $x^* \in X^*$ is such that $\lim_{k\to\infty} \langle x, x_{n_k}^* \rangle = \langle x, x^* \rangle$ for all $x \in X$ and some subsequence $(x_{n_k}^*)_{k\ge 1}$ of $(x_n^*)_{n\ge 1}$. Show that $x^* \in X_0^\perp$, that is, $\langle x_0, x^* \rangle = 0$ for all $x_0 \in X_0$.

(c) Use the Banach–Alaoglu theorem and part (b) to deduce that
\[ \lim_{n\to\infty} d(x_n^*, X_0^\perp) = 0. \]

(d) Suppose that $\|x_n^*\| \le R$ for all $n \ge 1$. Use Proposition 4.52 and part (c) to conclude that there exists a sequence $(y_n^*)_{n\ge 1}$ in $\overline{B}(0;R)$ such that $\lim_{n\to\infty} \langle x, x_n^* - y_n^* \rangle = 0$ for all $x \in X$ and $\langle x_0, y_n^* \rangle = 0$ for all $x_0 \in X_0$ and all $n \ge 1$.

(e) Show that the mapping $P : x \mapsto (\langle x, x_n^* - y_n^* \rangle)_{n\ge 1}$ is well defined and bounded from X into $c_0$ and that $j \circ P$ is a projection in X whose range equals $X_0$.


4.33 Show that $\ell^1$ has the Schur property: If $\lim_{n\to\infty} x_n = x$ weakly in $\ell^1$, then $\lim_{n\to\infty} x_n = x$ strongly in $\ell^1$.

4.34 Let $(\Omega, \mathscr{F}, \mu)$ be a probability space. Let $(f_n)_{n\ge 1}$ be a bounded sequence in $L^1(\Omega)$ which is uniformly integrable, that is,
\[ \lim_{r\to\infty} \sup_{n\ge 1} \|1_{\{|f_n| > r\}} f_n\|_1 = 0. \]
Show that $(f_n)_{n\ge 1}$ contains a weakly convergent subsequence by completing the details of the following argument.

(a) For $k = 1, 2, \dots$ the sequence defined by $f_n^{(k)} := 1_{\{|f_n| \le k\}} f_n$ contains a subsequence that is weakly convergent in $L^2(\Omega)$, and hence weakly convergent in $L^1(\Omega)$. Denote by $f^{(k)}$ their weak limits in $L^1(\Omega)$.

(b) Show that $\|f^{(k)} - f^{(l)}\|_1 \le \liminf_{n\to\infty} \|f_n^{(k)} - f_n^{(l)}\|_1$ and the latter tends to 0 by uniform integrability.

(c) Conclude that the limit $\lim_{k\to\infty} f^{(k)} =: f$ exists in $L^1(\Omega)$ and that $\lim_{n\to\infty} f_n = f$ weakly in $L^1(\Omega)$.
4.35 Deduce Theorem 4.46 from Theorem 4.47.

4.36 Prove the various identifications made in the discussion following Example 4.57.

4.37 Prove that if Y is a closed subspace of a reflexive Banach space X, then the quotient space X/Y is reflexive.

4.38 Show that if X is a Banach lattice, then an element $x \in X$ satisfies $x \ge 0$ if and only if $\langle x, x^* \rangle \ge 0$ for all $x^* \in X^*$ satisfying $x^* \ge 0$.

4.39 Show that if X is a Banach lattice and $x^* \in X^*$ satisfies $x^* \ge 0$, then for all $x \in X$ we have the following assertions:

(a) $\langle x^+, x^* \rangle = \sup\{\langle x, y^* \rangle : 0 \le y^* \le x^*\}$;
(b) $\langle |x|, x^* \rangle = \sup\{\langle x, y^* \rangle : |y^*| \le x^*\}$.

4.40 Under the assumptions of Theorem 4.2, let $\mu \in M_{\mathbb{R}}(X)$ represent the functional $\phi \in (C_0(X))^*$. Show that the measures representing $\phi^+$, $\phi^-$, and $|\phi|$ are $\mu^+$, $\mu^-$, and $|\mu|$, respectively.

4.41 For $1 < p < \infty$ consider the space $\ell^p[0,1]$ introduced in Problem 2.34.

(a) Show that there is a natural isometric isomorphism $(\ell^p[0,1])^* \simeq \ell^q[0,1]$, where $\frac{1}{p} + \frac{1}{q} = 1$.

(b) Show that the function $F : [0,1] \to \ell^p[0,1]$ given by
\[ (F(t))(s) = \begin{cases} 1, & s = t, \\ 0, & s \ne t, \end{cases} \]
has the following properties:
(i) $t \mapsto \langle F(t), g \rangle$ is measurable for all $g \in \ell^q[0,1]$;
(ii) $t \mapsto F(t)$ fails to be strongly measurable.

4.42 Let $(A_n)_{n\ge 1}$ be a sequence of disjoint intervals of positive measure $|A_n|$ in the interval $[0,1]$ and define $f : [0,1] \to c_0$ by
\[ f := \sum_{n\ge 1} \frac{1}{|A_n|} 1_{A_n} u_n, \]
where $(u_n)_{n\ge 1}$ is the sequence of standard unit vectors of $c_0$.

(a) Show that f is strongly measurable.
(b) Show that for all $x^* \in c_0^* = \ell^1$ the integral $\int_0^1 \langle f(t), x^* \rangle \, dt$ is well defined.
(c) Show that f fails to be Bochner integrable.

4.43 Consider the mapping $f : (0,1) \to L^\infty(0,1)$ given by $f(t) := 1_{(0,t)}$.

(a) Show that $\langle f, x^* \rangle$ is measurable for all $x^* \in (L^\infty(0,1))^*$.
Hint: Monotone scalar-valued functions are measurable.
(b) Show that f fails to be Bochner integrable.
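
The identity for the matrix norm stated in the problem on $d \times d$ matrices above can be checked numerically before it is proved. The following sketch treats only the special case of a real $2 \times 2$ matrix; the function names and the brute-force sampling of the unit circle are choices made for this illustration and are not part of the text.

```python
import math


def max_eig_of_gram(a):
    """Largest eigenvalue of A^T A for a real 2x2 matrix A (quadratic formula)."""
    (p, q), (r, s) = a
    g11 = p * p + r * r          # entries of the Gram matrix A^T A
    g12 = p * q + r * s
    g22 = q * q + s * s
    trace = g11 + g22
    det = g11 * g22 - g12 * g12
    return 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))


def sampled_norm_squared(a, samples=100_000):
    """Approximate sup of |Ax|^2 over Euclidean unit vectors x by sampling angles."""
    (p, q), (r, s) = a
    best = 0.0
    for k in range(samples):
        theta = math.pi * k / samples
        x, y = math.cos(theta), math.sin(theta)
        best = max(best, (p * x + q * y) ** 2 + (r * x + s * y) ** 2)
    return best


A = ((1.0, 2.0), (3.0, 4.0))
lam_max = max_eig_of_gram(A)
approx = sampled_norm_squared(A)
print(lam_max, approx)
```

The sampled value never exceeds the largest eigenvalue (up to rounding) and approaches it as the sampling is refined, in agreement with the claimed identity.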


5 
Bounded Operators 


In the first chapter, bounded operators have been introduced and some of their basic 
properties were established. This chapter treats some of their deeper properties. In Sec- 
tions 5.1-5.3 we begin with three results, each of which expresses a certain robustness 
property of the class of bounded operators: the uniform boundedness theorem (Theorem 
5.2), the open mapping theorem (Theorem 5.8), and the closed graph theorem (Theo- 
rem 5.11). Completeness plays a critical role through their dependence on the Baire 
category theorem. In Section 5.4 we present the fourth main result of this chapter, the 
closed range theorem (Theorem 5.16). 

As simple as the definition of a bounded operator may seem, in practice it can be 
quite hard to establish boundedness of a given linear operator. This applies in particular 
to some of the most important operators in Analysis, such as the Fourier—Plancherel 
transform and the Hilbert transform. Their properties are studied in fair detail in Sec- 
tions 5.5 and 5.6. The final Section 5.7 discusses the Riesz—Thorin interpolation theorem 
(Theorem 5.39) and its companion, the Marcinkiewicz interpolation theorem (Theorem 
5.47). 


5.1 The Uniform Boundedness Theorem 


The proof of the uniform boundedness theorem, as well as the proofs of some other 
results in this chapter, depend on the Baire category theorem. 


5.1.a The Baire Category Theorem 


Theorem 5.1 (Baire category theorem). Let X be a nonempty complete metric space. Let $F_1, F_2, \dots$ be closed subsets of X such that
\[ X = \bigcup_{n\ge 1} F_n. \]
Then at least one of the sets $F_n$ has nonempty interior.

Proof Assuming that all sets $F_n$ have empty interior, we prove the existence of an $x \in X$ not contained in any one of the $F_n$’s.

Pick an $x_1 \in \complement F_1$. This is possible, for otherwise we have $F_1 = X$ and $F_1$ contains open balls. Since $F_1$ is closed, $\complement F_1$ is open and therefore contains an open ball $B(x_1; r_1)$. By shrinking the radius a bit, we may even assume that the closed ball $\overline{B}(x_1; r_1)$ is contained in $\complement F_1$ and, moreover, that $0 < r_1 < 1$. The ball $B(x_1; r_1)$ is not contained in $F_2$ and consequently the open set $B(x_1; r_1) \setminus F_2$ is nonempty. By the same reasoning as before, this set contains a closed ball $\overline{B}(x_2; r_2)$ with radius $0 < r_2 < \frac{1}{2}$. Continuing in this way we obtain a decreasing sequence of closed balls $\overline{B}(x_1; r_1) \supseteq \overline{B}(x_2; r_2) \supseteq \dots$ with $0 < r_n < \frac{1}{n}$. The sequence $(x_n)_{n\ge 1}$ is a Cauchy sequence, and therefore has a limit x, by the completeness of X. It is clear that $x \in \bigcap_{n\ge 1} \overline{B}(x_n; r_n)$, and therefore $x \notin \bigcup_{n\ge 1} F_n$.


5.1.b The Uniform Boundedness Theorem 


The uniform boundedness theorem infers uniform boundedness of a family of bounded 
operators from their pointwise boundedness. 


Theorem 5.2 (Uniform boundedness theorem). Let $(T_i)_{i\in I}$ be a family of bounded operators from a Banach space X into a normed space Y. If
\[ \sup_{i\in I} \|T_i x\| < \infty, \quad x \in X, \]
then
\[ \sup_{i\in I} \|T_i\| < \infty. \]

Proof For each $i \in I$ the sets $\{x \in X : \|T_i x\| \le n\}$ are closed by the continuity of the operator $T_i$. Since the intersection of closed sets is closed, the sets
\[ F_n := \Bigl\{x \in X : \sup_{i\in I} \|T_i x\| \le n\Bigr\} = \bigcap_{i\in I} \{x \in X : \|T_i x\| \le n\} \]
are closed. Moreover, their union equals X. By the Baire category theorem, at least one of them, say $F_{n_0}$, has nonempty interior. Accordingly there exist $x_0 \in X$ and $r_0 > 0$ such that $B(x_0; r_0) \subseteq F_{n_0}$.

Fix an index $i \in I$. For any $x \in X$ with norm $\|x\| < r_0$ we write $x = x_0 - (x_0 - x)$ and note that both $x_0$ and $x_0 - x$ belong to $B(x_0; r_0)$. As a consequence,
\[ \|T_i x\| \le \|T_i x_0\| + \|T_i(x_0 - x)\| \le n_0 + n_0 = 2n_0. \]
Hence, for all $x \in X$ with norm $\|x\| < 1$,
\[ \|T_i x\| \le 2n_0/r_0. \]
This implies that $\|T_i\| \le 2n_0/r_0$. This being true for all $i \in I$, we have shown that $\sup_{i\in I} \|T_i\| \le 2n_0/r_0$.
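
The completeness of X cannot be dropped from the theorem. A standard example, not taken from the text, uses the space of finitely supported sequences with the supremum norm and the functionals $T_n x := n x_n$: for each fixed x only finitely many values $T_n x$ are nonzero, yet $\|T_n\| \ge n$. The sketch below illustrates this; representing a finitely supported sequence by a Python list is an assumption of the example.

```python
def T(n, x):
    """T_n x = n * x_n for a finitely supported sequence x (list, 1-based index n)."""
    return n * x[n - 1] if n <= len(x) else 0.0


def sup_norm(x):
    """Supremum norm of a finitely supported sequence."""
    return max(abs(t) for t in x) if x else 0.0


def unit_vector(n):
    """The n-th standard unit vector, written as a list of length n."""
    return [0.0] * (n - 1) + [1.0]


# Pointwise boundedness: for a fixed x the values T_n x vanish for n > len(x).
x = [1.0, -0.5, 0.25, 2.0]
pointwise_bound = max(abs(T(n, x)) for n in range(1, 1000))

# No uniform bound: testing T_n on the n-th unit vector shows ||T_n|| >= n.
norm_lower_bounds = [
    abs(T(n, unit_vector(n))) / sup_norm(unit_vector(n)) for n in (1, 10, 100)
]

print(pointwise_bound, norm_lower_bounds)
```

The space of finitely supported sequences is not complete in the supremum norm, so there is no conflict with Theorem 5.2.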


We continue with some typical applications. 


Proposition 5.3. Let X and Y be Banach spaces and suppose that $T_n : X \to Y$, $n \ge 1$, are bounded operators such that $\lim_{n\to\infty} T_n x =: Tx$ exists for all $x \in X$. Then the mapping $x \mapsto Tx$ is linear and bounded from X into Y and
\[ \|T\| \le \liminf_{n\to\infty} \|T_n\|. \]

Proof The uniform boundedness theorem implies $\sup_{n\ge 1} \|T_n\| < \infty$ and therefore we can apply Proposition 1.19.


Proposition 5.4 (Boundedness of bilinear mappings). Let X, Y, Z be normed spaces and suppose that at least one of the spaces X and Y is a Banach space. Let $B : X \times Y \to Z$ be linear and bounded in both variables separately. Then there exists a constant $C \ge 0$ such that
\[ \|B(x,y)\| \le C \|x\| \|y\|, \quad x \in X, \ y \in Y. \]
In particular, B is jointly continuous.

Proof Assume that X is a Banach space (if Y is a Banach space we interchange the roles of X and Y). For each $y \in Y$, $T_y x := B(x,y)$ defines an element of $\mathscr{L}(X,Z)$ since B is bounded in its first variable. Also, for each $x \in X$ we have $\sup_{\|y\| \le 1} \|T_y x\| < \infty$ since B is bounded in its second variable. Since X is a Banach space, the uniform boundedness theorem shows that $\{T_y : \|y\| \le 1\}$ is uniformly bounded in $\mathscr{L}(X,Z)$. With $M := \sup_{\|y\| \le 1} \|T_y\|$ we then obtain, for all $y \in Y$ with $\|y\| \le 1$,
\[ \|B(x,y)\| = \|T_y x\| \le M \|x\|. \]
By a scaling argument for the second variable, this implies the claim as stated.


The same proof works if we assume that B is linear in the first variable, conjugate- 
linear in the second variable, and bounded in both variables separately; this observation 
will be useful in the context of Hilbert spaces. 

The following proposition and its corollary give an application of the uniform bound- 
edness theorem to duality. 


Proposition 5.5 (Weakly bounded sets are bounded). A subset S of a Banach space X is bounded if and only if it is weakly bounded, that is, the set $\langle S, x^* \rangle := \{\langle x, x^* \rangle : x \in S\}$ is bounded for all $x^* \in X^*$.

Proof Only the ‘if’ part needs proof. Suppose that S is weakly bounded. Its image J(S) in $X^{**}$ under the natural isometric embedding $J : X \to X^{**}$ (see Section 4.7.b) is a family of bounded operators from $X^*$ to $\mathbb{K}$, and by assumption these operators are pointwise bounded. By the uniform boundedness theorem, these operators are uniformly bounded in operator norm. This is the same as saying that J(S) is a bounded subset of $X^{**}$. Since J is isometric, it follows that S is bounded.


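Corollary 5.6. Let X be a Banach space. The following assertions hold:

(1) if $\lim_{n\to\infty} \langle x, x_n^* \rangle = \langle x, x^* \rangle$ for all $x \in X$, then $(x_n^*)_{n\ge 1}$ is bounded;
(2) if $\lim_{n\to\infty} \langle x_n, x^* \rangle = \langle x, x^* \rangle$ for all $x^* \in X^*$, then $(x_n)_{n\ge 1}$ is bounded.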


Proof Part (1) follows directly from the uniform boundedness theorem and part (2) is 
a special case of Proposition 5.5. 


5.2 The Open Mapping Theorem 


The next main theorem is the open mapping theorem. Among other things it implies that 
a bijective bounded operator between Banach spaces has a bounded inverse (and hence 
is an isomorphism). Its proof relies on the following lemma, in which we use subscripts 
to tell apart open balls in X and Y. 


Lemma 5.7. Let X be a Banach space, Y a normed space, and let $T \in \mathscr{L}(X,Y)$ be a bounded operator. If $0 < r, R < \infty$ are such that
\[ B_Y(0;r) \subseteq \overline{T(B_X(0;R))}, \]
then
\[ B_Y(0;r) \subseteq T(B_X(0;2R)). \]

As is apparent from the proof, the constant 2 may be replaced by $1 + \varepsilon$ for any fixed $\varepsilon > 0$.


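Proof Fix an arbitrary $y_0 \in B_Y(0;r)$. Then $y_0 \in \overline{T(B_X(0;R))}$, so we can write
\[ y_0 = Tx_1 + y_1 \quad \text{with } x_1 \in B_X(0;R) \text{ and } \|y_1\| < \tfrac{1}{2} r. \]
Then $2y_1 \in B_Y(0;r)$, so
\[ 2y_1 = Tx_2 + y_2 \quad \text{with } x_2 \in B_X(0;R) \text{ and } \|y_2\| < \tfrac{1}{2} r. \]
Then $2y_2 \in B_Y(0;r)$, so
\[ 2y_2 = Tx_3 + y_3 \quad \text{with } x_3 \in B_X(0;R) \text{ and } \|y_3\| < \tfrac{1}{2} r. \]
Continuing this way, for all $N \ge 1$ we obtain
\[ y_0 = Tx_1 + y_1 \]
\[ = Tx_1 + \tfrac{1}{2} Tx_2 + \tfrac{1}{2} y_2 \]
\[ = Tx_1 + \tfrac{1}{2} Tx_2 + \tfrac{1}{4} Tx_3 + \tfrac{1}{4} y_3 \]
\[ = Tx_1 + \tfrac{1}{2} Tx_2 + \tfrac{1}{4} Tx_3 + \dots + \tfrac{1}{2^N} Tx_{N+1} + \tfrac{1}{2^N} y_{N+1}. \]
Clearly, $\lim_{N\to\infty} \frac{1}{2^N} y_{N+1} = 0$ and
\[ \sum_{k=0}^\infty \frac{1}{2^k} \|x_{k+1}\| < \sum_{k=0}^\infty \frac{R}{2^k} = 2R. \]
This implies that the sum $\sum_{k=0}^\infty \frac{1}{2^k} x_{k+1}$ converges in X, by the completeness of X. The boundedness of T implies that the sum $\sum_{k=0}^\infty \frac{1}{2^k} Tx_{k+1}$ converges in Y to $T \sum_{k=0}^\infty \frac{1}{2^k} x_{k+1}$, and therefore
\[ y_0 = \lim_{N\to\infty} \Bigl( \sum_{k=0}^{N} \frac{1}{2^k} Tx_{k+1} + \frac{1}{2^N} y_{N+1} \Bigr) = T \sum_{k=0}^\infty \frac{1}{2^k} x_{k+1} \in T(B_X(0;2R)). \]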
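
The halving scheme of the proof can be tried out in a toy finite-dimensional setting. In the sketch below, the $2 \times 2$ matrix `T`, the deliberately inexact solver `S` (which satisfies $y - TSy = 0.3\,y$, so each residual is at most half of the vector being approximated), and the function name `solve_by_halving` are all assumptions made for this illustration; the text itself works with an abstract operator and an approximation property of the closure.

```python
def mat_vec(a, v):
    """Product of a 2x2 matrix (nested lists) with a vector of length 2."""
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]


T = [[2.0, 1.0], [0.0, 4.0]]
T_INV = [[0.5, -0.125], [0.0, 0.25]]
# Inexact solver: T(S y) = 0.7 y, so the residual y - T S y equals 0.3 y.
S = [[0.7 * entry for entry in row] for row in T_INV]


def solve_by_halving(y0, steps=60):
    """Accumulate x = sum_k 2^{-k} x_{k+1}, following the scheme of the proof."""
    x = [0.0, 0.0]
    target = list(y0)          # y_0 first, afterwards 2 * y_k
    weight = 1.0               # the factor 2^{-k}
    for _ in range(steps):
        x_next = mat_vec(S, target)                            # x_{k+1}
        t_x = mat_vec(T, x_next)
        remainder = [target[i] - t_x[i] for i in range(2)]     # y_{k+1}
        x = [x[i] + weight * x_next[i] for i in range(2)]
        target = [2.0 * r for r in remainder]
        weight *= 0.5
    return x


y0 = [1.0, 2.0]
x = solve_by_halving(y0)
print(x, mat_vec(T, x))
```

Although every individual step solves the equation only approximately, the weighted series converges to an exact solution of $Tx = y_0$, which is the mechanism behind the lemma.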


Theorem 5.8 (Open mapping theorem). Let X and Y be Banach spaces. If $T \in \mathscr{L}(X,Y)$ is bounded and surjective, then T maps open sets to open sets.

Proof Set $F_n := \overline{T(B_X(0;n))}$. The surjectivity of T implies that $Y = \bigcup_{n\ge 1} F_n$. Therefore, by the Baire category theorem (Theorem 5.1), some $F_{n_0}$ has nonempty interior. This means that there exist $y_0 \in Y$ and $r_0 > 0$ such that
\[ B_Y(y_0; r_0) \subseteq \overline{T(B_X(0;n_0))}. \]
In view of $T(-x) = -Tx$, we then also have
\[ B_Y(-y_0; r_0) \subseteq \overline{T(B_X(0;n_0))}. \]
Writing $y = \frac{1}{2}(y_0 + y) + \frac{1}{2}(-y_0 + y)$, it follows that
\[ B_Y(0; r_0) \subseteq \tfrac{1}{2} \overline{T(B_X(0;n_0))} + \tfrac{1}{2} \overline{T(B_X(0;n_0))} = \overline{T(B_X(0;n_0))}. \]
Now we can invoke the lemma and find
\[ B_Y(0; r_0) \subseteq T(B_X(0; 2n_0)). \]

Let U be an open set in X; we wish to prove that T(U) is open. To this end let $Tx \in T(U)$ be given, with $x \in U$; we wish to prove that T(U) contains the open ball $B_Y(Tx; \rho)$ for some $\rho > 0$.

Since U is open, there is an $\varepsilon > 0$ such that $B_X(x; \varepsilon) \subseteq U$. Let $\delta := \varepsilon/(2n_0)$. Then T(U) contains $Tx + T(B_X(0;\varepsilon)) = Tx + \delta T(B_X(0; 2n_0))$, and the latter contains the open ball $Tx + \delta B_Y(0; r_0) = B_Y(Tx; \delta r_0)$.
Figure 5.1 Proof of the closed graph theorem


Corollary 5.9. Let X and Y be Banach spaces. If $T \in \mathscr{L}(X,Y)$ is bijective, then T is an isomorphism.

Proof The fact that T maps open sets to open sets can now be reformulated as saying that $(T^{-1})^{-1}(U)$ is open for every open set U, so $T^{-1}$ is continuous.


5.3 The Closed Graph Theorem

Let X and Y be Banach spaces. The graph of a mapping $T : X \to Y$ is the set
\[ G(T) := \{(x, Tx) : x \in X\} \]
in $X \times Y$. If T is linear, G(T) is a linear subspace of $X \times Y$. Endowing $X \times Y$ with the norm
\[ \|(x,y)\|_1 := \|x\| + \|y\| \tag{5.1} \]
turns this space into a Banach space and it is easy to check that if T is bounded, then G(T) is closed in $X \times Y$. Since all product norms on $X \times Y$ are equivalent (see Example 1.32), the particular choice of product norm made in (5.1) is immaterial.


Definition 5.10 (Closed operators). A linear operator T : X — Y is closed if its graph 
is closed in X x Y. 


Theorem 5.11 (Closed graph theorem). Let X and Y be Banach spaces. If a linear 
operator T : X — Y is closed, then T is bounded. 


Proof Let $\pi_X : X \times Y \to X$ and $\pi_Y : X \times Y \to Y$ be given by $\pi_X(x,y) := x$ and $\pi_Y(x,y) := y$. Both mappings are bounded and of norm one. By assumption $Z := G(T)$ is a closed subspace of $X \times Y$, hence a Banach space with respect to the inherited norm. Consider the linear operator $S : X \to Z$ given by $Sx := (x, Tx)$. This operator is a bijection whose inverse $S^{-1}$ is the bounded operator $\pi_X$ (restricted to Z). By Corollary 5.9 the inverse S of $S^{-1}$ is bounded. Hence also $T = \pi_Y \circ S$ is bounded.
It was noted in Proposition 4.15 that if a subspace $X_0$ of a normed space X is the range of a projection in X, then $X_0$ is complemented in X. As an application of the closed graph theorem we prove the following converse:

Proposition 5.12. A closed subspace of a Banach space X is complemented if and only if it is the range of a projection in X.

Proof It remains to prove the ‘only if’ part. Let $X = X_0 \oplus X_1$ be a direct sum decomposition and consider the mapping $\pi_0 : (x_0, x_1) \mapsto (x_0, 0)$; here we suggestively write $(x_0, x_1)$ for the element $x_0 + x_1$ of X. To prove its boundedness, we shall prove that its graph is closed. Suppose that $(x_0^n, x_1^n) \to (x_0, x_1)$ and $(x_0^n, 0) \to (y_0, y_1)$ in X. Then also
\[ (0, x_1^n) = (x_0^n, x_1^n) - (x_0^n, 0) \to (x_0, x_1) - (y_0, y_1) = (x_0 - y_0, x_1 - y_1) \]
in X. The closedness of $X_0$ and $X_1$ in X implies that $(y_0, y_1) = \lim_{n\to\infty} (x_0^n, 0)$ belongs to $X_0$ and $(x_0 - y_0, x_1 - y_1) = \lim_{n\to\infty} (0, x_1^n)$ belongs to $X_1$. This forces $y_0 = x_0$ and $y_1 = 0$. It follows that $(y_0, y_1) = (x_0, 0) = \pi_0(x_0, x_1)$. This proves that $\pi_0$ is closed. Therefore, by the closed graph theorem, $\pi_0$ is bounded. In view of $\pi_0^2 = \pi_0$, we conclude that $\pi_0$ is a projection. Its range equals $X_0$.
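
In finite dimensions the projection associated with a direct sum decomposition can be written down explicitly. The following sketch uses the decomposition of $\mathbb{R}^2$ into $X_0 = \mathrm{span}\{(1,0)\}$ and $X_1 = \mathrm{span}\{(1,1)\}$; this particular choice of subspaces and the function names are assumptions of the example.

```python
def pi0(v):
    """Projection onto X0 = span{(1, 0)} along X1 = span{(1, 1)} in R^2."""
    x, y = v
    return (x - y, 0.0)


def pi1(v):
    """Complementary projection onto X1 = span{(1, 1)} along X0."""
    x, y = v
    return (y, y)


v = (3.0, 5.0)
v0, v1 = pi0(v), pi1(v)

# The two components recover v, and pi0 is idempotent.
recombined = (v0[0] + v1[0], v0[1] + v1[1])
print(v0, v1, recombined, pi0(v0))
```

Since $X_0$ and $X_1$ are not orthogonal here, the Euclidean norm of this projection exceeds 1; boundedness, not contractivity, is what the proposition provides.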


As a second application we prove that if we have a direct sum decomposition $X = X_0 \oplus X_1$, then the dual space admits a direct sum decomposition which, somewhat informally stated, is given as $X^* = X_0^* \oplus X_1^*$.

Proposition 5.13. Let X be a Banach space with direct sum decomposition $X = X_0 \oplus X_1$ and associated projections $\pi_0$ and $\pi_1$. Then $X^*$ admits the direct sum decomposition $X^* = X_0^* \oplus X_1^*$, where $X_0^*$ and $X_1^*$ are isometrically identified with the ranges of the adjoint projections $\pi_0^*$ and $\pi_1^*$ in $X^*$ by restriction.

Proof The adjoint operators $\pi_0^*$ and $\pi_1^*$ are projections and accordingly we have the direct sum decomposition $X^* = R(\pi_0^*) \oplus R(\pi_1^*)$. We must prove that the restriction mappings $\pi_0^* x^* \mapsto (\pi_0^* x^*)|_{X_0}$ and $\pi_1^* x^* \mapsto (\pi_1^* x^*)|_{X_1}$ establish isometric isomorphisms of $R(\pi_0^*)$ and $R(\pi_1^*)$ onto $X_0^*$ and $X_1^*$ respectively.

It is clear that the restriction $(\pi_0^* x^*)|_{X_0}$ defines an element of $X_0^*$ and
\[ \|(\pi_0^* x^*)|_{X_0}\|_{X_0^*} = \sup_{x_0 \in X_0,\ \|x_0\| = 1} |\langle x_0, \pi_0^* x^* \rangle| = \sup_{x_0 \in X_0,\ \|x_0\| = 1} |\langle \pi_0 x_0, x^* \rangle| \]
\[ = \sup_{x \in X,\ \|x\| = 1} |\langle \pi_0 x, x^* \rangle| = \|\pi_0^* x^*\|_{X^*}. \]
In the same way we have $\|(\pi_1^* x^*)|_{X_1}\|_{X_1^*} = \|\pi_1^* x^*\|_{X^*}$. This proves that the mappings $\pi_0^* x^* \mapsto (\pi_0^* x^*)|_{X_0}$ and $\pi_1^* x^* \mapsto (\pi_1^* x^*)|_{X_1}$ are isometric. To prove their surjectivity, fix $j \in \{0, 1\}$ and let $x_j^* \in X_j^*$ be arbitrary. If $x^* \in X^*$ is a Hahn–Banach extension of $x_j^*$, for all $x_j \in X_j$ we have $\langle x_j, x_j^* \rangle = \langle \pi_j x_j, x^* \rangle = \langle x_j, \pi_j^* x^* \rangle$, and therefore $x_j^* = (\pi_j^* x^*)|_{X_j}$.
As a final application of the closed graph theorem we prove the following variation of Proposition 2.26.

Proposition 5.14. Let $1 \le p, q \le \infty$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. Let $(\Omega, \mu)$ be a measure space, which is assumed to be $\sigma$-finite if $p = \infty$. A measurable function f belongs to $L^p(\Omega)$ if and only if $fg \in L^1(\Omega)$ for all $g \in L^q(\Omega)$. In that case we have
\[ \|f\|_p = \sup_{\|g\|_q \le 1} \int_\Omega |fg| \, d\mu. \]

Proof The ‘only if’ part is immediate from Hölder’s inequality. To prove the ‘if’ part we may assume that f is not identically 0. Using Corollary 2.21, the mapping $g \mapsto fg$ is easily seen to be closed as a mapping from $L^q(\Omega)$ to $L^1(\Omega)$: if $g_n \to g$ in $L^q(\Omega)$ and $fg_n \to h$ in $L^1(\Omega)$, we may pass to a subsequence such that $g_{n_k} \to g$ and $fg_{n_k} \to h$ $\mu$-almost everywhere, and therefore $fg = \lim_{k\to\infty} fg_{n_k} = h$ $\mu$-almost everywhere. By the closed graph theorem, the operator $g \mapsto fg$ is bounded. It follows that the assumptions of Proposition 2.26 are satisfied, with M the norm of the operator $g \mapsto fg$. This gives that $f \in L^p(\Omega)$ with bound
\[ \|f\|_p \le \sup_{\|g\|_q \le 1} \int_\Omega |fg| \, d\mu. \]
Hölder’s inequality gives the opposite bound.
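
On a finite set with counting measure the supremum in the norm identity is attained, and this can be verified directly. The sketch below is such an illustration for p = 3; the extremal choice $g = |f|^{p-1} / \|f\|_p^{p-1}$ is the standard one from the equality case of Hölder's inequality, and the function names are assumptions of the example.

```python
def p_norm(v, p):
    """The l^p norm of a finite sequence (counting measure on a finite set)."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def extremal_g(f, p):
    """g = |f|^(p-1) / ||f||_p^(p-1); it has ||g||_q = 1 when 1/p + 1/q = 1."""
    norm = p_norm(f, p)
    return [abs(t) ** (p - 1) / norm ** (p - 1) for t in f]


def pairing(f, g):
    """The sum of |f g|, that is, the integral of |fg| for counting measure."""
    return sum(abs(a * b) for a, b in zip(f, g))


f = [3.0, -1.0, 2.0, 0.5]
p = 3.0
q = p / (p - 1.0)

g = extremal_g(f, p)
# A second admissible g, normalised in l^q, for comparison (Hoelder's inequality).
flat = [1.0, 1.0, 1.0, 1.0]
g_flat = [t / p_norm(flat, q) for t in flat]

print(p_norm(f, p), pairing(f, g), pairing(f, g_flat))
```

The extremal g reproduces $\|f\|_p$ exactly, while any other g in the unit ball of $\ell^q$ gives a smaller or equal value.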


5.4 The Closed Range Theorem 


As a warm-up for the main result of this section we begin with a simple application of 
the Hahn—Banach theorem. Recall that the annihilator of a subset A of X is the set 


\[ A^\perp := \{x^* \in X^* : \langle x, x^* \rangle = 0 \text{ for all } x \in A\} \]
and the pre-annihilator of a subset B of $X^*$ is the set
\[ {}^\perp B := \{x \in X : \langle x, x^* \rangle = 0 \text{ for all } x^* \in B\}. \]

Proposition 5.15. For any operator $T \in \mathscr{L}(X,Y)$, where X and Y are Banach spaces, we have
\[ \overline{R(T)} = {}^\perp(N(T^*)). \]
In particular, T has dense range if and only if $T^*$ is injective.

Proof ‘⊆’: If $y = Tx \in R(T)$, then for all $y^* \in N(T^*)$ we have $\langle y, y^* \rangle = \langle x, T^* y^* \rangle = 0$, and therefore $y \in {}^\perp(N(T^*))$. This proves $R(T) \subseteq {}^\perp(N(T^*))$. The result now follows from the fact that ${}^\perp(N(T^*))$ is closed.

‘⊇’: If $x_0 \notin \overline{R(T)} =: Y_0$, by Corollary 4.12 there exists a $y^* \in Y^*$ such that $\langle x_0, y^* \rangle \ne 0$ and $y^*|_{Y_0} = 0$. For all $x \in X$, $\langle x, T^* y^* \rangle = \langle Tx, y^* \rangle = 0$ and therefore $y^* \in N(T^*)$. Since $\langle x_0, y^* \rangle \ne 0$ we have $x_0 \notin {}^\perp(N(T^*))$.
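
For matrices the proposition reduces to the familiar statement that the range of T is the orthogonal complement of the null space of the transpose. The following sketch checks this for one $2 \times 2$ matrix of rank one; the matrix and the vector `y_star` spanning $N(T^*)$ are chosen for this illustration only.

```python
def apply(a, v):
    """Matrix-vector product for matrices given as nested lists."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix given as nested lists."""
    return [list(col) for col in zip(*a)]


def dot(u, v):
    """Euclidean pairing of two vectors."""
    return sum(s * t for s, t in zip(u, v))


T = [[1, 2], [3, 6]]       # rank one, with range spanned by (1, 3)
y_star = [3, -1]           # spans the null space of the transpose of T

kernel_check = apply(transpose(T), y_star)
pairings = [dot(apply(T, x), y_star) for x in ([1, 0], [0, 1], [2, -5])]
outside = dot([1, 0], y_star)   # (1, 0) pairs nontrivially, so it is not in R(T)

print(kernel_check, pairings, outside)
```

Every vector in the range pairs to zero with `y_star`, whereas the vector (1, 0), which is not a multiple of (1, 3), does not.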


In Sections 7.2 and 7.3 we will encounter interesting classes of operators whose 
ranges are closed. For such operators, the closed range theorem provides a ‘dual’ variant 
of Proposition 5.15 which is considerably harder to prove. We need this theorem in our 
discussion of duality of Fredholm operators in Section 7.3. 


Theorem 5.16 (Closed range theorem). Let the operator T € £(X,Y) be given, where 
X and Y are Banach spaces. If R(T) is closed, then 


R(T*) = (N(T))*. 
As a consequence, R(T*) is weak* closed. 


Proof Once we have proved the identity R(T*) = (N(T))+, the weak* closedness of 
R(T*) follows from the general observation that annihilators are weak* closed. 


C: If x* = T*y* € R(T*), then for all x € N(7) and we have (x,x*) = (Tx,y*) =0. 
This shows that x* € (N(7))+ 


'$\supseteq$': Suppose that $x^* \in (N(T))^\perp$. For elements $y = Tx \in R(T)$ we define
\[ \phi(y) := \langle x, x^*\rangle. \]
To see that this is well defined, suppose that we also have $y = Tx'$. Then $T(x - x') =
y - y = 0$ implies $x - x' \in N(T)$ and therefore $\langle x - x', x^*\rangle = 0$. This gives the
well-definedness as claimed.

For all $z \in N(T)$ we have $\phi(y) = \langle x - z, x^*\rangle$ and therefore
\[ |\phi(y)| \le \|x - z\| \, \|x^*\|. \]
By taking the infimum over all $z \in N(T)$ we obtain
\[ |\phi(y)| \le d(x, N(T)) \, \|x^*\|. \]


We claim that the closedness of the range of $T$ implies the existence of a constant $C \ge 0$
such that
\[ d(x, N(T)) \le C\|Tx\|. \tag{5.2} \]
Taking this for granted for the moment, we first show how to complete the proof. From
(5.2) we obtain the estimate
\[ |\phi(y)| \le C\|Tx\| \, \|x^*\| = C\|y\| \, \|x^*\|, \]
proving that $\phi$ is bounded as a functional defined on $R(T)$. By the Hahn–Banach theorem,
we obtain an element $y^* \in Y^*$ extending $\phi$. For all $x \in X$ we obtain
\[ \langle x, T^*y^*\rangle = \langle Tx, y^*\rangle = \phi(Tx) = \langle x, x^*\rangle, \]
so $x^* = T^*y^* \in R(T^*)$.
It remains to prove (5.2). For this we note that
\[ d(x, N(T)) = \|x + N(T)\|_{X/N(T)}. \tag{5.3} \]
The operator $T$ induces a well-defined and bounded quotient operator $T_{/}$ from $X/N(T)$
onto $R(T)$ by setting $T_{/}(x + N(T)) := Tx$. It is clear that $T_{/}$ is injective and surjective,
and since $R(T)$ is closed we may apply the open mapping theorem and obtain that $T_{/}$
maps $X/N(T)$ isomorphically onto $R(T)$. Denoting by $C$ the norm of its inverse we
obtain the desired estimate from (5.3) and
\[ \|x + N(T)\|_{X/N(T)} \le C\|T_{/}(x + N(T))\| = C\|Tx\|. \]
This completes the proof of (5.2).


5.5 The Fourier Transform 


In the present section and the next we study two nontrivial examples of bounded
operators: the Fourier–Plancherel transform and the Hilbert transform. It is not an
exaggeration to state that, at least from the point of view of the theory of partial differential
equations, these rank among the most important bounded operators in all of Analysis.


5.5.a Definition and General Properties 


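Definition 5.17 (Fourier transform). The Fourier transform of a function $f \in L^1(\mathbb{R}^d)$ is
the function $\widehat f : \mathbb{R}^d \to \mathbb{C}$ defined by
\[ \widehat f(\xi) := \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} f(x) \exp(-i x \cdot \xi) \, dx, \qquad \xi \in \mathbb{R}^d, \tag{5.4} \]
where $x \cdot \xi := \sum_{j=1}^d x_j \xi_j$.

It is evident that $\widehat f \in L^\infty(\mathbb{R}^d)$ and $\|\widehat f\|_\infty \le (2\pi)^{-d/2}\|f\|_1$. This shows that the
operator $\mathscr{F} : f \mapsto \widehat f$, which will be referred to as the Fourier transform, defines a bounded
operator from $L^1(\mathbb{R}^d)$ to $L^\infty(\mathbb{R}^d)$.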


Remark 5.18. In certain situations it is useful to absorb the constant $(2\pi)^{-d/2}$ into the
measure. Denoting the resulting normalised Lebesgue measure by
\[ dm(x) = (2\pi)^{-d/2} \, dx, \]
one may interpret the Fourier transform as the operator from $L^1(\mathbb{R}^d, m)$ to $L^\infty(\mathbb{R}^d, m)$
given by
\[ \widehat f(\xi) = \int_{\mathbb{R}^d} f(x) \exp(-i x \cdot \xi) \, dm(x), \qquad \xi \in \mathbb{R}^d. \]
The advantage of this point of view is that this operator is contractive. In many applications,
however, working with the normalised Lebesgue measure is somewhat artificial,
and for this reason we stick with (5.4) most of the time.


The dominated convergence theorem implies that for all $f \in L^1(\mathbb{R}^d)$ the function $\widehat f$
is sequentially continuous, hence continuous. More is true: the following lemma shows
that $\widehat f$ belongs to $C_0(\mathbb{R}^d)$, the space of continuous functions vanishing at infinity.

Theorem 5.19 (Riemann–Lebesgue lemma). For all $f \in L^1(\mathbb{R}^d)$ we have $\widehat f \in C_0(\mathbb{R}^d)$.


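Proof  By separation of variables one sees that $\lim_{|\xi| \to \infty} |\widehat f(\xi)| = 0$ for step functions
$f = \sum_{i=1}^n c_i \mathbf{1}_{Q_i}$, where $n \ge 1$, $c_i \in \mathbb{C}$, and the $Q_i$ are cubes with sides parallel to the coordinate
axes ($1 \le i \le n$). Indeed, if $Q = \prod_{j=1}^d [a_j, b_j]$ is such a cube, then
\[ \widehat{\mathbf{1}_Q}(\xi) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \prod_{j=1}^d \mathbf{1}_{[a_j,b_j]}(x_j) \exp(-i x_j \xi_j) \, dx
 = \frac{1}{(2\pi)^{d/2}} \prod_{j=1}^d \frac{1}{i\xi_j} \bigl( \exp(-i a_j \xi_j) - \exp(-i b_j \xi_j) \bigr), \]
where the $j$th factor is to be read as $b_j - a_j$ if $\xi_j = 0$. If $|\xi| > r$, then at least one coordinate
satisfies $|\xi_{j_0}| > r/\sqrt{d}$ and then
\[ |\widehat{\mathbf{1}_Q}(\xi)| \le \frac{1}{(2\pi)^{d/2}} \cdot \frac{2\sqrt{d}}{r} \prod_{j \ne j_0} M_j, \]
where the constants
\[ M_j = \sup_{t \in \mathbb{R} \setminus \{0\}} \Bigl| \frac{1}{it} \bigl( \exp(-i a_j t) - \exp(-i b_j t) \bigr) \Bigr| \]
are finite. This proves that $\widehat f \in C_0(\mathbb{R}^d)$ as claimed.

Since the functions of the form considered above are dense in $L^1(\mathbb{R}^d)$ by (the proof
of) Proposition 2.29, and since $C_0(\mathbb{R}^d)$ is a closed subspace of $L^\infty(\mathbb{R}^d)$, by Proposition 1.18
the Fourier transform extends uniquely to a bounded operator from $L^1(\mathbb{R}^d)$
to $C_0(\mathbb{R}^d)$. Identifying $C_0(\mathbb{R}^d)$ with a closed subspace of $L^\infty(\mathbb{R}^d)$, this extension agrees
with the Fourier transform.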
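The one-dimensional computation behind the proof can be checked numerically. The sketch below assumes $d = 1$ and a hypothetical interval $[a, b] = [-1, 2]$; it compares the closed form $(2\pi)^{-1/2}(e^{-ia\xi} - e^{-ib\xi})/(i\xi)$ with a midpoint-rule approximation of the defining integral, and checks the decay bound $2/(\sqrt{2\pi}\,|\xi|)$ that drives the Riemann–Lebesgue lemma for indicators.

```python
import cmath
import math

def ft_indicator(a, b, xi):
    """Closed form of the Fourier transform of the indicator of [a, b] (d = 1)."""
    if xi == 0.0:
        return (b - a) / math.sqrt(2 * math.pi)
    numerator = cmath.exp(-1j * a * xi) - cmath.exp(-1j * b * xi)
    return numerator / (1j * xi * math.sqrt(2 * math.pi))

def ft_numeric(a, b, xi, n=20000):
    """Midpoint rule for (2*pi)^(-1/2) * integral_a^b exp(-i x xi) dx."""
    h = (b - a) / n
    total = sum(cmath.exp(-1j * (a + (k + 0.5) * h) * xi) for k in range(n))
    return total * h / math.sqrt(2 * math.pi)

a, b = -1.0, 2.0  # hypothetical interval
checks = [abs(ft_indicator(a, b, xi) - ft_numeric(a, b, xi)) for xi in (0.5, 3.0, 10.0)]

# Decay: |transform(xi)| <= 2 / (sqrt(2*pi) * |xi|), which tends to 0 as |xi| grows.
decay_ok = all(
    abs(ft_indicator(a, b, xi)) <= 2 / (math.sqrt(2 * math.pi) * xi) + 1e-15
    for xi in (1.0, 10.0, 100.0, 1000.0)
)
```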
We continue with an inversion theorem for the Fourier transform. Its proof is based
on a simple lemma.

Lemma 5.20. For $\lambda > 0$ the Fourier transform of the Gaussian function
\[ g(x) = \frac{1}{(2\pi)^{d/2}} \exp\bigl(-\tfrac12 \lambda |x|^2\bigr) \]
equals
\[ \widehat g(\xi) = \frac{1}{(2\pi\lambda)^{d/2}} \exp\bigl(-\tfrac12 |\xi|^2/\lambda\bigr). \]


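Proof  First let $d = 1$. Completing squares and using Cauchy's theorem to shift the path
of integration, we find
\[ \widehat g(\xi) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(x) \exp(-i x \xi) \, dx
 = \frac{1}{2\pi} \exp\bigl(-\tfrac12 \xi^2/\lambda\bigr) \int_{\mathbb{R} + i\xi/\lambda} \exp\bigl(-\tfrac12 \lambda z^2\bigr) \, dz \]
\[ = \frac{1}{2\pi} \exp\bigl(-\tfrac12 \xi^2/\lambda\bigr) \int_{-\infty}^{\infty} \exp\bigl(-\tfrac12 \lambda x^2\bigr) \, dx
 = \frac{1}{(2\pi\lambda)^{1/2}} \exp\bigl(-\tfrac12 \xi^2/\lambda\bigr). \]
The general case follows from this by separation of variables.

A different proof based on the Picard–Lindelöf theorem is outlined in Problem 5.19.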
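The Gaussian computation can be sanity-checked numerically in dimension one. The sketch below assumes the normalisation $g(x) = (2\pi)^{-1/2}\exp(-\tfrac12\lambda x^2)$, for which the transform under convention (5.4) is $(2\pi\lambda)^{-1/2}\exp(-\tfrac12\xi^2/\lambda)$; the truncation width and grid size are arbitrary illustrative choices.

```python
import cmath
import math

def gaussian(x, lam):
    """g(x) = (2*pi)^(-1/2) * exp(-lam * x^2 / 2)."""
    return math.exp(-0.5 * lam * x * x) / math.sqrt(2 * math.pi)

def ft_gaussian_numeric(xi, lam, half_width=12.0, n=24000):
    """Midpoint rule for (2*pi)^(-1/2) * integral g(x) exp(-i x xi) dx."""
    h = 2 * half_width / n
    total = 0j
    for k in range(n):
        x = -half_width + (k + 0.5) * h
        total += gaussian(x, lam) * cmath.exp(-1j * x * xi)
    return total * h / math.sqrt(2 * math.pi)

def ft_gaussian_closed(xi, lam):
    """Closed form (2*pi*lam)^(-1/2) * exp(-xi^2 / (2*lam))."""
    return math.exp(-0.5 * xi * xi / lam) / math.sqrt(2 * math.pi * lam)

errors = [
    abs(ft_gaussian_numeric(xi, lam) - ft_gaussian_closed(xi, lam))
    for lam in (0.5, 1.0, 4.0)
    for xi in (0.0, 1.0, 2.5)
]
```

For $\lambda = 1$ the two functions coincide, which is the fixed-point property $\widehat g = g$ used in the proof of the inversion theorem.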


Theorem 5.21 (Fourier inversion theorem). If $f \in L^1(\mathbb{R}^d)$ satisfies $\widehat f \in L^1(\mathbb{R}^d)$, then
for almost all $x \in \mathbb{R}^d$ we have the identity
\[ f(x) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \widehat f(\xi) \exp(i x \cdot \xi) \, d\xi. \]

In particular this result implies that the Fourier transform is injective as a mapping
from $L^1(\mathbb{R}^d)$ to $L^\infty(\mathbb{R}^d)$. A more general injectivity result will be proved in Theorem
5.31.


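Proof  By Lemma 5.20, the Gaussian function $g(x) = (2\pi)^{-d/2} \exp(-\tfrac12 |x|^2)$ satisfies
\[ g(x) = \widehat g(x) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} g(\xi) \exp(-i x \cdot \xi) \, d\xi = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} g(\xi) \exp(i x \cdot \xi) \, d\xi, \]
where the last identity uses that $g$ is real-valued, so that taking complex conjugates
leaves the expression unchanged. Substituting $x/\lambda$ for $x$ and changing variables in the integral we obtain
\[ g_\lambda(x) := \lambda^{-d} g(\lambda^{-1} x) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} g(\lambda \xi) \exp(i x \cdot \xi) \, d\xi. \]
By Proposition 2.34 and Corollary 2.21, after passing to an appropriate subsequence
$\lambda_j \downarrow 0$ we have $g_{\lambda_j} * f(x) \to f(x)$ for almost all $x \in \mathbb{R}^d$ as $j \to \infty$. Using the above, it
follows that for almost all $x \in \mathbb{R}^d$ we have
\[ f(x) = \lim_{j \to \infty} \int_{\mathbb{R}^d} g_{\lambda_j}(y) f(x - y) \, dy \]
\[ = \lim_{j \to \infty} \int_{\mathbb{R}^d} \Bigl( \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} g(\lambda_j \xi) \exp(i y \cdot \xi) \, d\xi \Bigr) f(x - y) \, dy \]
\[ = \lim_{j \to \infty} \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} (2\pi)^{d/2} g(\lambda_j \xi) \exp(i x \cdot \xi) \widehat f(\xi) \, d\xi \]
\[ = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \widehat f(\xi) \exp(i x \cdot \xi) \, d\xi, \]
where the third identity follows from Fubini's theorem and the substitution $u = x - y$, and the last step is
justified by dominated convergence, which can be used here since
$\widehat f$ is integrable, $g$ is bounded, and $(2\pi)^{d/2} g(\lambda_j \xi) \to (2\pi)^{d/2} g(0) = 1$ pointwise as $j \to \infty$.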


The Fourier transform of the translate $\tau_h f$ of a function $f \in L^1(\mathbb{R})$ is given by
\[ \widehat{\tau_h f}(\xi) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x + h) \exp(-i x \xi) \, dx = \exp(i h \xi) \widehat f(\xi). \]
It follows that if $\widehat f(\xi_0) = 0$ for some $\xi_0 \in \mathbb{R}$, then $\widehat{\tau_h f}(\xi_0) = 0$ for all $h \in \mathbb{R}$. Therefore the
linear span of the set of translates of $f$ is contained in $\{g \in L^1(\mathbb{R}) : \widehat g(\xi_0) = 0\}$, which is a
proper closed subspace of $L^1(\mathbb{R})$. Thus, a necessary condition in order that the linear span of
the set of translates of a function $f \in L^1(\mathbb{R})$ be dense in $L^1(\mathbb{R})$ is that $\widehat f$ be zero-free. Strikingly,
this necessary condition is also sufficient. This is the content of the next theorem, which will be
proved by operator theoretic methods in Section 13.1.b.

[Portrait: Norbert Wiener, 1894–1964]

Theorem 5.22 (Wiener's Tauberian theorem). If the Fourier transform of a function
$f \in L^1(\mathbb{R})$ is zero-free, then the span of the set of all translates of $f$ is dense in $L^1(\mathbb{R})$.


5.5.b The Plancherel Theorem 


The Fourier transform enjoys an important $L^2$-boundedness property.

Theorem 5.23 (Plancherel, preliminary version). If $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, then $\widehat f \in L^2(\mathbb{R}^d)$ and
\[ \|\widehat f\|_2 = \|f\|_2. \]


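Proof  Since $f \in L^1(\mathbb{R}^d)$, $\widehat f$ is bounded and $\xi \mapsto \exp(-\tfrac12 \lambda |\xi|^2) |\widehat f(\xi)|^2$ is integrable
for all $\lambda > 0$, and
\[ \int_{\mathbb{R}^d} \exp\bigl(-\tfrac12 \lambda |\xi|^2\bigr) |\widehat f(\xi)|^2 \, d\xi
 = \int_{\mathbb{R}^d} \exp\bigl(-\tfrac12 \lambda |\xi|^2\bigr) \widehat f(\xi) \overline{\widehat f(\xi)} \, d\xi \]
\[ = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \exp\bigl(-\tfrac12 \lambda |\xi|^2\bigr) \int_{\mathbb{R}^d} f(x) \exp(-i x \cdot \xi) \, dx \int_{\mathbb{R}^d} \overline{f(y)} \exp(i y \cdot \xi) \, dy \, d\xi \]
\[ = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \Bigl( \int_{\mathbb{R}^d} \exp\bigl(-\tfrac12 \lambda |\xi|^2\bigr) \exp(-i (x - y) \cdot \xi) \, d\xi \Bigr) f(x) \overline{f(y)} \, dx \, dy \]
\[ = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \frac{1}{(2\pi\lambda)^{d/2}} \exp\bigl(-\tfrac12 |x - y|^2/\lambda\bigr) f(x) \overline{f(y)} \, dx \, dy
 = \int_{\mathbb{R}^d} (f * \phi_{\sqrt\lambda})(y) \overline{f(y)} \, dy, \]
where $\phi_\mu(x) = \mu^{-d} \phi(\mu^{-1} x)$ with $\phi(y) = (2\pi)^{-d/2} \exp(-\tfrac12 |y|^2)$; the inner integral is evaluated
by Lemma 5.20, and the change of order
of integration is justified by the absolute integrability of the integrand. Applying Proposition 2.34
we find that $\lim_{\lambda \downarrow 0} f * \phi_{\sqrt\lambda} = f$ in $L^2(\mathbb{R}^d)$. Then,
\[ \lim_{\lambda \downarrow 0} \int_{\mathbb{R}^d} (f * \phi_{\sqrt\lambda})(y) \overline{f(y)} \, dy = \int_{\mathbb{R}^d} f(y) \overline{f(y)} \, dy = \|f\|_2^2. \]
On the other hand,
\[ \lim_{\lambda \downarrow 0} \int_{\mathbb{R}^d} \exp\bigl(-\tfrac12 \lambda |\xi|^2\bigr) |\widehat f(\xi)|^2 \, d\xi = \int_{\mathbb{R}^d} |\widehat f(\xi)|^2 \, d\xi = \|\widehat f\|_2^2 \]
by monotone convergence (the integrands increase to $|\widehat f|^2$ as $\lambda \downarrow 0$). This completes the proof.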
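As a numerical illustration of the identity $\|\widehat f\|_2 = \|f\|_2$, the following sketch takes the hypothetical test function $f(x) = e^{-|x|}$ on $\mathbb{R}$ (an assumption made for the example), whose transform under convention (5.4) is $\sqrt{2/\pi}\,(1 + \xi^2)^{-1}$ and for which both squared norms equal 1. The transform is computed by quadrature rather than from the closed form; grid sizes and cut-offs are illustrative choices.

```python
import math

def f(x):
    return math.exp(-abs(x))

def fourier_transform(xi, x_max=30.0, n=3000):
    """(2*pi)^(-1/2) * integral f(x) exp(-i x xi) dx.

    Since f is even, this equals (2*pi)^(-1/2) * 2 * integral_0^inf f(x) cos(x xi) dx,
    which is approximated by the midpoint rule on [0, x_max].
    """
    h = x_max / n
    total = sum(f((k + 0.5) * h) * math.cos((k + 0.5) * h * xi) for k in range(n))
    return 2 * total * h / math.sqrt(2 * math.pi)

def l2_norm_squared(func, t_max, n):
    """Midpoint rule for the integral of |func|^2 over [-t_max, t_max], func even."""
    h = t_max / n
    return 2 * h * sum(abs(func((k + 0.5) * h)) ** 2 for k in range(n))

norm_f_sq = l2_norm_squared(f, 30.0, 3000)                    # exact value: 1
norm_fhat_sq = l2_norm_squared(fourier_transform, 50.0, 500)  # should be close to 1
```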
It is of some interest to consider the vector space
\[ \mathscr{F}^2(\mathbb{R}^d) := \{ f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d) : \widehat f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d) \}. \]
There is some redundancy in this definition, for if $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, then $\widehat f \in L^2(\mathbb{R}^d)$
by the Plancherel theorem. The advantage of the above format is that it brings out the
symmetry between $f$ and $\widehat f$ explicitly. The interest of this class is explained by the
following two observations.

Lemma 5.24. The Fourier transform maps $\mathscr{F}^2(\mathbb{R}^d)$ bijectively onto itself.

Proof  Injectivity of $f \mapsto \widehat f$ follows from the Plancherel theorem and surjectivity from
the Fourier inversion theorem, which implies that if $f \in \mathscr{F}^2(\mathbb{R}^d)$, then $f$ is the Fourier transform
of the function $\xi \mapsto \widehat f(-\xi)$ in $\mathscr{F}^2(\mathbb{R}^d)$.


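Lemma 5.25. $\mathscr{F}^2(\mathbb{R}^d)$ is dense in $L^2(\mathbb{R}^d)$.

Proof  Since $C_c^\infty(\mathbb{R}^d)$ is dense in $L^2(\mathbb{R}^d)$ by Proposition 2.29, it suffices to show that
every $f \in C_c^\infty(\mathbb{R}^d)$ belongs to $\mathscr{F}^2(\mathbb{R}^d)$. Integrating by parts, for all $f \in C_c^\infty(\mathbb{R}^d)$ and
multi-indices $\alpha \in \mathbb{N}^d$ we have
\[ \widehat{\partial^\alpha f}(\xi) = i^{|\alpha|} \xi^\alpha \widehat f(\xi), \]
where $\partial^\alpha := \partial_1^{\alpha_1} \circ \cdots \circ \partial_d^{\alpha_d}$ with $\partial_j$ the partial derivative in the $j$th direction,
$\xi^\alpha := \xi_1^{\alpha_1} \cdots \xi_d^{\alpha_d}$, and $|\alpha| := \alpha_1 + \cdots + \alpha_d$. Since Fourier transforms of integrable functions
are bounded, this implies that $\xi \mapsto (1 + |\xi|^k) \widehat f(\xi)$ is bounded for every integer $k \ge 1$.
The desired result follows from this.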


Combining these lemmas with Proposition 1.18, we obtain the following improved 
version of Theorem 5.23. 


Theorem 5.26 (Plancherel). The restriction of the Fourier transform to $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$
has a unique extension to an isometry from $L^2(\mathbb{R}^d)$ onto itself.

Definition 5.27 (Fourier–Plancherel transform). This isometry of $L^2(\mathbb{R}^d)$ is called the
Fourier–Plancherel transform.

With slight abuse of notation we denote the Fourier–Plancherel transform again by
$\mathscr{F} : f \mapsto \widehat f$. It is important to realise that $\widehat f$ is no longer given by the pointwise formula
(5.4). In fact, for functions $f \in L^2(\mathbb{R}^d)$ the integrand in (5.4) is not even integrable
unless $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$.

Remark 5.28. Theorem 5.26 also holds with respect to the normalised Lebesgue measure
$dm(x) = (2\pi)^{-d/2} \, dx$: the restriction to $L^1(\mathbb{R}^d, m) \cap L^2(\mathbb{R}^d, m)$ of the Fourier transform
as defined in Remark 5.18 extends to an isometry from $L^2(\mathbb{R}^d, m)$ onto itself.


For later use we record two further properties of the Fourier—Plancherel transform. 


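Proposition 5.29. For all $f, g \in L^2(\mathbb{R}^d)$ we have
\[ \int_{\mathbb{R}^d} f(x) \widehat g(x) \, dx = \int_{\mathbb{R}^d} \widehat f(x) g(x) \, dx. \]

Proof  For $f, g \in \mathscr{F}^2(\mathbb{R}^d)$ the identity follows by writing out the Fourier transforms
and using Fubini's theorem:
\[ \int_{\mathbb{R}^d} f(x) \widehat g(x) \, dx = \int_{\mathbb{R}^d} f(x) \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} g(\xi) \exp(-i x \cdot \xi) \, d\xi \, dx \]
\[ = \int_{\mathbb{R}^d} \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} f(x) \exp(-i x \cdot \xi) \, dx \, g(\xi) \, d\xi = \int_{\mathbb{R}^d} \widehat f(\xi) g(\xi) \, d\xi. \]
The general case follows by approximation, using that $\mathscr{F}^2(\mathbb{R}^d)$ is dense in $L^2(\mathbb{R}^d)$ by
Lemma 5.25.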


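The appearance of the factor $(2\pi)^{d/2}$ in the next proposition is an artefact of our
convention to normalise the Fourier transform with this factor, but not the convolution.

Proposition 5.30. Let $f \in L^1(\mathbb{R}^d)$ and let $g \in L^1(\mathbb{R}^d)$ or $g \in L^2(\mathbb{R}^d)$. For almost all
$\xi \in \mathbb{R}^d$ we have
\[ \widehat{f * g}(\xi) = (2\pi)^{d/2} \widehat f(\xi) \widehat g(\xi). \]

Proof  If $g \in C_c(\mathbb{R}^d)$, then $f * g \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ by Young's inequality, and by
Fubini's theorem and a change of variables we obtain
\[ \widehat{f * g}(\xi) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \Bigl( \int_{\mathbb{R}^d} f(x - y) g(y) \, dy \Bigr) \exp(-i x \cdot \xi) \, dx \]
\[ = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \Bigl( \int_{\mathbb{R}^d} f(x - y) \exp(-i (x - y) \cdot \xi) \, dx \Bigr) \exp(-i y \cdot \xi) g(y) \, dy \]
\[ = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} f(u) \exp(-i u \cdot \xi) \, du \int_{\mathbb{R}^d} g(y) \exp(-i y \cdot \xi) \, dy \]
\[ = (2\pi)^{d/2} \widehat f(\xi) \widehat g(\xi). \]
This proves the identity for $g \in C_c(\mathbb{R}^d)$. For $p \in \{1, 2\}$ and $\frac1p + \frac1q = 1$, Young's inequality
implies that the $L^q$-function $\widehat{f * g}$ depends continuously on $g$ with respect to the $L^p$-norm. Since
$C_c(\mathbb{R}^d)$ is dense in $L^p(\mathbb{R}^d)$ by Proposition 2.29, it follows that the identity extends to
arbitrary functions $g \in L^p(\mathbb{R}^d)$.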
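A small numerical check of the convolution identity in dimension one may help to see where the factor $\sqrt{2\pi}$ enters. The sketch below assumes the hypothetical choice $f = g = \mathbf{1}_{[0,1]}$, whose convolution is the triangle function supported on $[0, 2]$, and compares a midpoint-rule approximation of the transform of the triangle with $\sqrt{2\pi}\,\widehat f(\xi)^2$.

```python
import cmath
import math

SQRT_2PI = math.sqrt(2 * math.pi)

def triangle(x):
    """Convolution of the indicator of [0, 1] with itself."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0

def ft_indicator_unit(xi):
    """Fourier transform of the indicator of [0, 1] at xi != 0."""
    return (1 - cmath.exp(-1j * xi)) / (1j * xi * SQRT_2PI)

def ft_triangle_numeric(xi, n=20000):
    """Midpoint rule for (2*pi)^(-1/2) * integral_0^2 triangle(x) exp(-i x xi) dx."""
    h = 2.0 / n
    total = sum(
        triangle((k + 0.5) * h) * cmath.exp(-1j * (k + 0.5) * h * xi)
        for k in range(n)
    )
    return total * h / SQRT_2PI

# Difference between the transform of f * g and sqrt(2*pi) * f_hat * g_hat.
gaps = [
    abs(ft_triangle_numeric(xi) - SQRT_2PI * ft_indicator_unit(xi) ** 2)
    for xi in (0.7, 2.0, 5.0)
]
```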


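From Proposition 2.43 we know that if $\mu$ is a real or complex measure, its variation
$|\mu|$ is a finite measure. Accordingly, the Lebesgue integrals of bounded Borel functions
with respect to $\mu$ are well defined. In particular we can define the Fourier transform of
a real or complex Borel measure $\mu$ on $\mathbb{R}^d$ by
\[ \widehat\mu(\xi) := \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \exp(-i x \cdot \xi) \, d\mu(x), \qquad \xi \in \mathbb{R}^d. \]
From
\[ |\widehat\mu(\xi)| \le \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} d|\mu| = \frac{1}{(2\pi)^{d/2}} \|\mu\|, \]
where $\|\mu\| = |\mu|(\mathbb{R}^d)$ is the variation norm of $\mu$, we see that $\widehat\mu$ is a bounded function,
and it is continuous by dominated convergence. Thus we have $\widehat\mu \in C_b(\mathbb{R}^d)$ and $\|\widehat\mu\|_\infty \le
(2\pi)^{-d/2} \|\mu\|$. The Riemann–Lebesgue lemma does not extend to the present setting, as
is demonstrated by the Dirac measure $\delta_0$ at the origin, whose Fourier transform is the constant function $(2\pi)^{-d/2}$.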




The Fourier inversion theorem (Theorem 5.21) implies that the Fourier transform is
injective as an operator from $L^1(\mathbb{R}^d)$ to $L^\infty(\mathbb{R}^d)$. More generally we have the following
result.

Theorem 5.31 (Injectivity of the Fourier transform). If $\mu$ is a real or complex Borel
measure on $\mathbb{R}^d$ satisfying $\widehat\mu(\xi) = 0$ for all $\xi \in \mathbb{R}^d$, then $\mu = 0$.


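Proof  To prove that $\mu = 0$ it suffices to show that
\[ \int_{\mathbb{R}^d} f \, d\mu = 0, \qquad f \in C_c(\mathbb{R}^d). \tag{5.5} \]
Indeed, approximating the indicator function of an open rectangle $R$ in $\mathbb{R}^d$ from below
by functions in $C_c(\mathbb{R}^d)$ and using dominated convergence (which can be justified by
decomposing the real and imaginary parts of $\mu$ into their positive and negative parts
using the Jordan decomposition (Theorem 2.45)), it then follows that $\mu(R) = 0$ for all
such rectangles $R$, whereupon Dynkin's lemma (Lemma E.4) implies that $\mu = 0$.

Turning to the proof of (5.5), fix $0 < \varepsilon < 1$ and $f \in C_c(\mathbb{R}^d)$. We may assume that
$\|f\|_\infty \le 1$. Let $r > 0$ be so large that the support of $f$ is contained in the interior of the cube $[-r, r]^d$
and $|\mu|(\complement [-r, r]^d) < \varepsilon$. By the Stone–Weierstrass theorem (Theorem 2.5) there exists
a linear combination $p : \mathbb{R}^d \to \mathbb{C}$ of the functions of the form $x \mapsto \exp(2\pi i k \cdot x / 2r)$
with $k \in \mathbb{Z}^d$ (that is, a 'trigonometric polynomial of period $2r$') such that
$\sup_{x \in [-r, r]^d} |f(x) - p(x)| < \varepsilon$. Then, noting that $\|p\|_\infty \le 1 + \varepsilon$,
\[ \Bigl| \int_{\mathbb{R}^d} f \, d\mu \Bigr| \le \Bigl| \int_{[-r,r]^d} f \, d\mu \Bigr| + \underbrace{\Bigl| \int_{\complement [-r,r]^d} f \, d\mu \Bigr|}_{=0} \]
\[ \le \int_{[-r,r]^d} |f - p| \, d|\mu| + \Bigl| \int_{[-r,r]^d} p \, d\mu \Bigr| \]
\[ \le \varepsilon \|\mu\| + \Bigl| \int_{\mathbb{R}^d} p \, d\mu - \int_{\complement [-r,r]^d} p \, d\mu \Bigr| \]
\[ \le \varepsilon \|\mu\| + \underbrace{\Bigl| \int_{\mathbb{R}^d} p \, d\mu \Bigr|}_{=0} + \varepsilon (1 + \varepsilon) = \varepsilon \|\mu\| + \varepsilon (1 + \varepsilon). \]
The equality in the last step follows from the assumption that $\widehat\mu$ vanishes, as it implies
$\int_{\mathbb{R}^d} p \, d\mu = 0$. Since $\varepsilon \in (0, 1)$ was arbitrary, this proves (5.5).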


For later reference we mention that this theorem also admits a discrete version, the
proof of which is an even more direct application of the Stone–Weierstrass theorem (see
Problem 5.17). The Fourier coefficients of a real or complex Borel measure $\mu$ on the
unit circle $\mathbb{T}$ are defined by
\[ \widehat\mu(n) := \frac{1}{2\pi} \int_{\mathbb{T}} \exp(-i n \theta) \, d\mu(\theta), \qquad n \in \mathbb{Z}. \]

Theorem 5.32 (Injectivity of the Fourier transform on the circle). If $\mu$ is a real or
complex Borel measure on $\mathbb{T}$ satisfying $\widehat\mu(n) = 0$ for all $n \in \mathbb{Z}$, then $\mu = 0$.

If $\mu$ is real-valued, it suffices to have $\widehat\mu(n) = 0$ for all $n \in \mathbb{N}$, for then $\widehat\mu(-n) = \overline{\widehat\mu(n)}$
implies that $\widehat\mu(n) = 0$ for all $n \in \mathbb{Z}$.


5.5.c Fourier Multiplier Operators 


The Plancherel theorem provides a method for constructing nontrivial bounded operators
on $L^2(\mathbb{R}^d)$ as follows. Given $m \in L^\infty(\mathbb{R}^d)$ and $f \in L^2(\mathbb{R}^d)$, the function
\[ m \widehat f : \xi \mapsto m(\xi) \widehat f(\xi) \]
belongs to $L^2(\mathbb{R}^d)$ and therefore the same is true for its inverse Fourier–Plancherel transform
$(m \widehat f)^\vee$.

Definition 5.33 (Fourier multiplier operators). For functions $m \in L^\infty(\mathbb{R}^d)$, the bounded
operator on $L^2(\mathbb{R}^d)$ defined by
\[ T_m : f \mapsto (m \widehat f)^\vee \]
is called the Fourier multiplier operator with symbol $m$.

The operator $T_m$ is bounded of norm $\|T_m\| = \|g \mapsto mg\|_{\mathscr{L}(L^2(\mathbb{R}^d))} = \|m\|_\infty$ (cf. Example 1.29).
We have the elementary properties
\[ T_{m_1 + m_2} = T_{m_1} + T_{m_2}, \qquad T_{m_1 m_2} = T_{m_1} \circ T_{m_2}. \]


Fourier multipliers can be characterised by commutation properties. This fact depends
on the following lemma.

Lemma 5.34. Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and let $1 \le p < \infty$. For a bounded
operator $T$ on $L^p(\Omega)$ the following assertions are equivalent:

(1) $T$ is a pointwise multiplier, that is, there exists a function $m \in L^\infty(\Omega)$ such that
$Tf = mf$ for all $f \in L^p(\Omega)$;

(2) $T$ commutes with all pointwise multipliers, that is, $T M_\phi = M_\phi T$ for all $\phi \in L^\infty(\Omega)$;
here, $M_\phi f = \phi f$.

If $\Omega = \mathbb{R}^d$ with Lebesgue measure and $1 \le p < \infty$, then (1) and (2) are equivalent to

(3) $T M_{e_\xi} = M_{e_\xi} T$ for all $\xi \in \mathbb{R}^d$, where $e_\xi(x) = \exp(i x \cdot \xi)$ for $x \in \mathbb{R}^d$.

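Proof  It is trivial that (1) implies (2). If, conversely, $T$ commutes with every pointwise
multiplier, then for all $f \in L^p(\Omega) \cap L^\infty(\Omega)$ we have
\[ Tf = T(f \mathbf{1}) = T M_f \mathbf{1} = M_f T\mathbf{1} = f \, T\mathbf{1} = M_{T\mathbf{1}} f. \]
Hence for all $f \in L^p(\Omega) \cap L^\infty(\Omega)$ we have
\[ \|M_{T\mathbf{1}} f\|_p = \|Tf\|_p \le \|T\| \, \|f\|_p. \]
Since $L^p(\Omega) \cap L^\infty(\Omega)$ is dense in $L^p(\Omega)$, this implies that pointwise multiplication by $T\mathbf{1}$
extends to a bounded operator on $L^p(\Omega)$. This forces $T\mathbf{1} \in L^\infty(\Omega)$ (see the observation
at the end of Section 2.3.a). Setting $m := T\mathbf{1}$, we obtain $T = M_m$. This proves that (2)
implies (1).

Now suppose that $\Omega = \mathbb{R}^d$ with Lebesgue measure. It is trivial that (2) implies (3).
Suppose now that (3) holds, with $1 \le p < \infty$. Fix $\varepsilon > 0$ and $f \in L^p(\mathbb{R}^d)$, and choose
$r > 0$ so large that $\|f_r - f\|_p < \varepsilon$, where $f_r := f \mathbf{1}_{[-r, r]^d}$. If $p$ is a linear combination of
$2r$-periodic trigonometric exponentials, (3) implies
\[ T(p f_r) = p \, T f_r. \]
By the Stone–Weierstrass theorem, every continuous $2r$-periodic function $\phi$ can be uniformly
approximated by linear combinations $p_n$ of such trigonometric exponentials. Applying the
preceding identity to $p_n$ and taking limits in $L^p(\mathbb{R}^d)$, we find that
\[ T(\phi f_r) = \phi \, T f_r \qquad \text{for all continuous $2r$-periodic functions } \phi. \]
If $\phi \in C_c(\mathbb{R}^d)$ and $r$ is so large that the support of $\phi$ is contained in the interior of $[-r, r]^d$,
this implies that $T(\phi_r f_r) = \phi_r \, T f_r$, where $\phi_r$ denotes the $2r$-periodic extension of the
restriction of $\phi$ to $[-r, r]^d$. As $r \to \infty$ we have $\phi_r f_r \to \phi f$ and
$\phi_r T f_r \to \phi \, T f$ in $L^p(\mathbb{R}^d)$, and therefore
\[ \phi \, T f = T(\phi f), \qquad \phi \in C_c(\mathbb{R}^d). \]
Finally, if $\phi \in L^\infty(\mathbb{R}^d)$, we can find a sequence of functions $\phi_n \in C_c(\mathbb{R}^d)$ converging to
$\phi$ pointwise almost everywhere and such that $\sup_n \|\phi_n\|_\infty < \infty$. Since $\phi_n T f \to \phi \, T f$
and $\phi_n f \to \phi f$ in $L^p(\mathbb{R}^d)$, we conclude that
\[ T(\phi f) = \phi \, T f, \qquad \phi \in L^\infty(\mathbb{R}^d). \]
This proves that (2) holds.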


As an application we have the following characterisation of translation invariant operators
on $L^2(\mathbb{R}^d)$.

Theorem 5.35 (Translation invariant operators on $L^2(\mathbb{R}^d)$). If $T$ is a bounded operator
on $L^2(\mathbb{R}^d)$ commuting with every translation, then $T$ is a Fourier multiplier
operator, that is, there exists a (necessarily unique) function $m \in L^\infty(\mathbb{R}^d)$ such that
\[ Tf = \mathscr{F}^{-1}(m \mathscr{F} f) \quad \text{for all } f \in L^2(\mathbb{R}^d). \]

Proof  Using the notation of Lemma 5.34 and letting $\tau_y f(x) := f(x + y)$, easy calculations
show that $M_{e_\xi} \mathscr{F} = \mathscr{F} \tau_\xi$ and $\tau_\xi \mathscr{F}^{-1} = \mathscr{F}^{-1} M_{e_\xi}$ for all $\xi \in \mathbb{R}^d$. These identities
imply that the operator $\widetilde T = \mathscr{F} T \mathscr{F}^{-1}$ has the property $M_{e_\xi} \widetilde T = \widetilde T M_{e_\xi}$ for all $\xi \in \mathbb{R}^d$,
and the lemma implies that $\widetilde T$ is a pointwise multiplier. This means that $T$ is a Fourier
multiplier.


5.6 The Hilbert Transform 
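
A case of special interest concerns the multiplier
\[ m(\xi) = -i \, \mathrm{sign}(\xi), \qquad \xi \in \mathbb{R}. \]
In order to obtain an explicit representation for the Fourier multiplier operator $T_m$, we
observe that $m = -i(\mathbf{1}_{\mathbb{R}_+} - \mathbf{1}_{\mathbb{R}_-})$ and consider the functions
\[ n_a^\pm(\xi) = \exp(-a|\xi|) \mathbf{1}_{\mathbb{R}_\pm}(\xi), \qquad a > 0. \]
Then
\[ (n_a^+)^\vee(x) = \frac{1}{\sqrt{2\pi}} \int_0^\infty \exp(-a\xi) \exp(i x \xi) \, d\xi = \frac{1}{\sqrt{2\pi}} \, \frac{1}{a - ix} \]
and
\[ (n_a^-)^\vee(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^0 \exp(a\xi) \exp(i x \xi) \, d\xi = \frac{1}{\sqrt{2\pi}} \, \frac{1}{a + ix}. \]
Formally letting $a \downarrow 0$, in view of Proposition 5.30 one expects that
\[ T_{-i\,\mathrm{sign}} f = -i \lim_{a \downarrow 0} \bigl( T_{n_a^+} f - T_{n_a^-} f \bigr)
 = -\frac{i}{\sqrt{2\pi}} \lim_{a \downarrow 0} \bigl( (n_a^+)^\vee - (n_a^-)^\vee \bigr) * f
 = \frac{1}{\pi} \lim_{a \downarrow 0} \int_{-\infty}^{\infty} \frac{y}{a^2 + y^2} f(\cdot - y) \, dy. \]
This suggests the formula
\[ T_{-i\,\mathrm{sign}} f(x) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{f(x - y)}{y} \, dy. \]
The above argument is nonrigorous and the convolution with the nonintegrable function
$1/x$ is not even well defined as an operator acting on $L^2(\mathbb{R})$. The next theorem turns the
above formal derivation into a rigorous argument.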


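Theorem 5.36 (Hilbert transform as a Fourier multiplier operator). The Fourier multiplier
operator $H := T_{-i\,\mathrm{sign}}$ is given by
\[ Hf = \lim_{\varepsilon \downarrow 0} \frac{1}{\pi} \int_{\{|y| > \varepsilon\}} \frac{f(\cdot - y)}{y} \, dy, \qquad f \in L^2(\mathbb{R}), \]
the convergence being in the sense of $L^2(\mathbb{R})$.

Proof  Setting
\[ H_\varepsilon f := \frac{1}{\pi} \int_{\{|y| > \varepsilon\}} \frac{f(\cdot - y)}{y} \, dy, \]
we see that $H_\varepsilon$ is the operator of convolution with the square integrable function $\phi_\varepsilon(x) =
\frac{1}{\pi x} \mathbf{1}_{\{|x| > \varepsilon\}}(x)$. The Fourier transform of this function can be rewritten as
\[ \widehat{\phi_\varepsilon}(\xi) = \frac{1}{\pi\sqrt{2\pi}} \lim_{R \to \infty} \int_{\{\varepsilon < |y| < R\}} \frac{1}{y} \exp(-i y \xi) \, dy \]
\[ = -\frac{i \, \mathrm{sign}(\xi)}{\pi\sqrt{2\pi}} \lim_{R \to \infty} \Bigl( \int_{\frac12\pi}^{\frac32\pi} \exp(|\xi| \varepsilon e^{i\theta}) \, d\theta - \int_{\frac12\pi}^{\frac32\pi} \exp(|\xi| R e^{i\theta}) \, d\theta \Bigr), \]
where the last step used the substitution $z = -i \, \mathrm{sign}(\xi) y$ and Cauchy's theorem applied to the function
$z \mapsto \exp(|\xi| z)/z$ on the boundary of the part of the
annulus $\{\varepsilon < |z| < R\}$ in the left half-plane. Now, for $\xi \ne 0$,
\[ \Bigl| \int_{\frac12\pi}^{\frac32\pi} \exp(|\xi| R e^{i\theta}) \, d\theta \Bigr| \le \int_{\frac12\pi}^{\frac32\pi} \exp(|\xi| R \cos\theta) \, d\theta
 = 2 \int_0^{\frac12\pi} \exp(-|\xi| R \sin\theta) \, d\theta \]
\[ \le 2 \int_0^{\frac12\pi} \exp(-(2/\pi) |\xi| R \theta) \, d\theta = \frac{1 - \exp(-|\xi| R)}{|\xi| R / \pi}, \]
where we used that $\sin\theta \ge (2/\pi)\theta$ for $\theta \in [0, \frac12\pi]$. As the expression on the right-hand
side tends to $0$ as $R \to \infty$, we find
\[ \widehat{\phi_\varepsilon}(\xi) = -\frac{i \, \mathrm{sign}(\xi)}{\pi\sqrt{2\pi}} \int_{\frac12\pi}^{\frac32\pi} \exp(|\xi| \varepsilon e^{i\theta}) \, d\theta. \]
As $\varepsilon \downarrow 0$, the integral on the right-hand side tends to $\pi$ for every $\xi \in \mathbb{R}$, and its modulus
is bounded by $\pi$. Hence by dominated convergence
\[ \lim_{\varepsilon \downarrow 0} \bigl\| \sqrt{2\pi} \, \widehat{\phi_\varepsilon} \widehat f - (-i \, \mathrm{sign}) \widehat f \bigr\|_2^2
 = \lim_{\varepsilon \downarrow 0} \int_{\mathbb{R}} |\widehat f(\xi)|^2 \Bigl| \frac{1}{\pi} \int_{\frac12\pi}^{\frac32\pi} \bigl( \exp(|\xi| \varepsilon e^{i\theta}) - 1 \bigr) \, d\theta \Bigr|^2 d\xi = 0. \]
As a result, by using Proposition 5.30 we obtain
\[ \lim_{\varepsilon \downarrow 0} \widehat{H_\varepsilon f} = \sqrt{2\pi} \lim_{\varepsilon \downarrow 0} \widehat{\phi_\varepsilon} \widehat f = -i \, \mathrm{sign} \cdot \widehat f \]
with convergence in $L^2(\mathbb{R})$. Therefore, by the Plancherel theorem and the definition of
$H$,
\[ \lim_{\varepsilon \downarrow 0} H_\varepsilon f = (-i \, \mathrm{sign} \cdot \widehat f)^\vee = Hf \]
with convergence in $L^2(\mathbb{R})$.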
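The arc integral $I(c) = \int_{\pi/2}^{3\pi/2} \exp(c e^{i\theta}) \, d\theta$ with $c = |\xi|\varepsilon$, which controls the multiplier of $H_\varepsilon$ in the proof, can be examined numerically. Differentiating under the integral sign gives $I'(c) = -2 \sin(c)/c$, so that $I(c) = \pi - 2\,\mathrm{Si}(c)$ with $\mathrm{Si}$ the sine integral. The following sketch, a sanity check rather than part of the proof, compares a midpoint-rule value of $I(c)$ with this closed form and confirms that $I(c)$ is real, bounded by $\pi$, and close to $\pi$ for small $c$.

```python
import cmath
import math

def arc_integral(c, n=4000):
    """Midpoint rule for the integral of exp(c * e^{i theta}) over [pi/2, 3*pi/2]."""
    h = math.pi / n
    total = sum(
        cmath.exp(c * cmath.exp(1j * (math.pi / 2 + (k + 0.5) * h)))
        for k in range(n)
    )
    return total * h

def sine_integral(c, terms=60):
    """Power series Si(c) = sum_k (-1)^k c^(2k+1) / ((2k+1) * (2k+1)!)."""
    total, term = 0.0, c  # term holds c^(2k+1) / (2k+1)!
    for k in range(terms):
        total += (-1) ** k * term / (2 * k + 1)
        term *= c * c / ((2 * k + 2) * (2 * k + 3))
    return total

samples = (0.001, 0.1, 1.0, 5.0)
values = [arc_integral(c) for c in samples]
gaps = [abs(v - (math.pi - 2 * sine_integral(c))) for v, c in zip(values, samples)]
```

In terms of these quantities, $\sqrt{2\pi}\,\widehat{\phi_\varepsilon}(\xi) = -i\,\mathrm{sign}(\xi)\, I(|\xi|\varepsilon)/\pi$, which tends to $-i\,\mathrm{sign}(\xi)$ as $\varepsilon \downarrow 0$.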


Definition 5.37 (Hilbert transform). The operator $H = T_{-i\,\mathrm{sign}}$ is called the Hilbert
transform.

The Hilbert transform has deep applications in Harmonic Analysis and the theory of
partial differential equations. It will resurface in our treatment of the Poisson semigroup
in Chapter 13. Its connection with harmonic functions is pointed out in Problem 5.22.

The final theorem of this section gives a characterisation of the Hilbert transform in
terms of commutation properties. A dilation on $L^2(\mathbb{R}^d)$ is an operator of the form
\[ D_\delta f(x) := f(\delta x), \qquad x \in \mathbb{R}^d, \]
where $\delta > 0$.


Theorem 5.38 (Characterisation of the Hilbert transform). If $T$ is a bounded operator
on $L^2(\mathbb{R})$ commuting with every translation and dilation, then $T$ is a linear combination
of the identity operator and the Hilbert transform.

Proof  Theorem 5.35 tells us that $T$ is a Fourier multiplier operator, say $T = T_m$ with
$m \in L^\infty(\mathbb{R})$. For $\delta > 0$, simple calculations give
\[ T_m D_\delta f(x) = T_{D_\delta m} f(\delta x) \]
and
\[ D_\delta T_m f(x) = T_m f(\delta x) \]
for almost all $x \in \mathbb{R}$. It follows that $T_m D_\delta = D_\delta T_m$ if and only if $T_{D_\delta m} = T_m$, that is, if
and only if $m(\delta\xi) = m(\xi)$ for almost all $\xi \in \mathbb{R}$. This is true for all $\delta > 0$ if and only if $m$
is constant almost everywhere on both $\mathbb{R}_+$ and $\mathbb{R}_-$. Hence $m = a\mathbf{1} + b \, \mathrm{sign}$ for suitable
$a, b \in \mathbb{C}$. The result now follows from Theorem 5.36.


5.7 Interpolation 


In general it can be difficult to establish $L^p$-boundedness of operators acting in spaces of
measurable functions. In such situations, interpolation theorems may be helpful. They
serve to establish $L^p$-boundedness in situations where suitable boundedness properties
can be established for 'endpoint' exponents $p_0$ and $p_1$ satisfying $p_0 < p < p_1$. In typical
applications one takes $p_0 \in \{1, 2\}$ and $p_1 \in \{2, \infty\}$ (cf. Sections 5.7.b and 5.7.c).
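
5.7.a The Riesz–Thorin Interpolation Theorem

Theorem 5.39 (Riesz–Thorin interpolation theorem). Let $(\Omega, \mathscr{F}, \mu)$ and $(\Omega', \mathscr{F}', \mu')$ be
measure spaces and let $1 \le p_0, p_1, q_0, q_1 \le \infty$. Let
\[ T_0 : L^{p_0}(\Omega) \to L^{q_0}(\Omega'), \qquad T_1 : L^{p_1}(\Omega) \to L^{q_1}(\Omega') \]
be bounded operators which are consistent in the sense that $Tf := T_0 f = T_1 f$ $\mu'$-almost
everywhere for all $f \in L^{p_0}(\Omega) \cap L^{p_1}(\Omega)$. Assume furthermore that
\[ \|T_0 f\|_{L^{q_0}(\Omega')} \le A_0 \|f\|_{L^{p_0}(\Omega)}, \qquad f \in L^{p_0}(\Omega), \]
\[ \|T_1 f\|_{L^{q_1}(\Omega')} \le A_1 \|f\|_{L^{p_1}(\Omega)}, \qquad f \in L^{p_1}(\Omega). \]
Let $0 < \theta < 1$ and set
\[ \frac{1}{p_\theta} = \frac{1 - \theta}{p_0} + \frac{\theta}{p_1}, \qquad \frac{1}{q_\theta} = \frac{1 - \theta}{q_0} + \frac{\theta}{q_1}. \]
Then the common restriction $T$ of $T_0$ and $T_1$ to $L^{p_0}(\Omega) \cap L^{p_1}(\Omega)$ maps this space into
$L^{q_\theta}(\Omega')$ and extends uniquely to a bounded operator
\[ T : L^{p_\theta}(\Omega) \to L^{q_\theta}(\Omega') \]
satisfying
\[ \|Tf\|_{L^{q_\theta}(\Omega')} \le A_0^{1-\theta} A_1^\theta \|f\|_{L^{p_\theta}(\Omega)}, \qquad f \in L^{p_\theta}(\Omega). \]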

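We begin with a simple lemma, which corresponds to the special case where the
interpolated operator is the identity operator.

Lemma 5.40. Let $(\Omega, \mathscr{F}, \mu)$ be a measure space, let $1 \le r_0 \le r_1 \le \infty$ and $0 < \theta < 1$,
and set $\frac{1}{r_\theta} = \frac{1 - \theta}{r_0} + \frac{\theta}{r_1}$. Then for all $f \in L^{r_0}(\Omega) \cap L^{r_1}(\Omega)$ we have $f \in L^{r_\theta}(\Omega)$ and
\[ \|f\|_{r_\theta} \le \|f\|_{r_0}^{1-\theta} \|f\|_{r_1}^{\theta}. \]

Proof  Write $|f|^{r_\theta} = |f|^{(1-\theta) r_\theta} |f|^{\theta r_\theta}$ and apply Hölder's inequality.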
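The inequality of Lemma 5.40 is easy to test on a finite set with counting measure, where the $L^r$-norms are the $\ell^r$-norms of finite sequences. The sketch below uses a hypothetical sample vector `f` and a few exponent pairs chosen for illustration, and records both sides of $\|f\|_{r_\theta} \le \|f\|_{r_0}^{1-\theta}\|f\|_{r_1}^{\theta}$.

```python
def lp_norm(values, r):
    """l^r norm of a finite sequence (counting measure); r = float('inf') is allowed."""
    if r == float("inf"):
        return max(abs(v) for v in values)
    return sum(abs(v) ** r for v in values) ** (1.0 / r)

def interpolated_exponent(r0, r1, theta):
    """r_theta defined by 1/r_theta = (1 - theta)/r0 + theta/r1 (theta/inf counts as 0)."""
    return 1.0 / ((1 - theta) / r0 + theta / r1)

f = [3.0, -1.5, 0.25, 2.0, -0.1]  # hypothetical sample function on a 5-point space
results = []
for r0, r1 in ((1.0, 2.0), (1.0, float("inf")), (2.0, 6.0)):
    for theta in (0.25, 0.5, 0.9):
        r_theta = interpolated_exponent(r0, r1, theta)
        lhs = lp_norm(f, r_theta)
        rhs = lp_norm(f, r0) ** (1 - theta) * lp_norm(f, r1) ** theta
        results.append((lhs, rhs))
```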


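Consider the open strip
\[ S := \{ z \in \mathbb{C} : 0 < \operatorname{Re} z < 1 \}. \]

Lemma 5.41 (Three lines lemma). Suppose that $F : \overline{S} \to \mathbb{C}$ is a bounded continuous
function, holomorphic on $S$, satisfying
\[ \sup_{v \in \mathbb{R}} |F(iv)| \le A_0, \qquad \sup_{v \in \mathbb{R}} |F(1 + iv)| \le A_1. \]
Then $F$ is uniformly bounded on $\overline{S}$ and for all $0 < \theta < 1$ we have
\[ \sup_{v \in \mathbb{R}} |F(\theta + iv)| \le A_0^{1-\theta} A_1^{\theta}. \]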
Proof  Let $F$ satisfy the assumptions of the lemma with constants $A_0 > 0$ and $A_1 > 0$. For each
$\varepsilon > 0$ the function $F_\varepsilon(z) := A_0^{z-1} A_1^{-z} \exp(\varepsilon z(z - 1)) F(z)$ satisfies the assumptions of the
lemma with constants $A_{0,\varepsilon} = A_{1,\varepsilon} = 1$. Moreover, $\lim_{|v| \to \infty} |F_\varepsilon(u + iv)| = 0$ uniformly
with respect to $u \in [0, 1]$. Hence for large enough $R$ we have $|F_\varepsilon| \le 1$ on the boundary
of the rectangle $\operatorname{Re} z \in [0, 1]$, $|\operatorname{Im} z| \le R$. The maximum modulus principle implies that
$|F_\varepsilon| \le 1$ on this rectangle, and by letting $R \to \infty$ we find that $|F_\varepsilon| \le 1$ on $\overline{S}$. The lemma
now follows by letting $\varepsilon \downarrow 0$.

Inspection of the proof reveals that the boundedness assumption on $F$ can be relaxed.
It cannot be entirely dispensed with, however: the function $F(z) = \exp(\exp(\pi i (z - \tfrac12)))$
is bounded on the lines $\operatorname{Re} z = 0$ and $\operatorname{Re} z = 1$, but unbounded on the line $\operatorname{Re} z = \tfrac12$.
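The behaviour of such a counterexample can be seen numerically. The sketch below assumes the function $F(z) = \exp(\exp(\pi i (z - \tfrac12)))$; this is one function with the stated properties, and whether it is exactly the one printed in the original is an assumption. It evaluates $|F|$ on the three vertical lines at a few heights: on $\operatorname{Re} z = 0$ and $\operatorname{Re} z = 1$ the inner exponential is purely imaginary, so $|F| = 1$, whereas on $\operatorname{Re} z = \tfrac12$ one has $F(\tfrac12 + iv) = \exp(e^{-\pi v})$, which is unbounded as $v \to -\infty$.

```python
import cmath
import math

def F(z):
    """Candidate counterexample F(z) = exp(exp(pi * i * (z - 1/2)))."""
    return cmath.exp(cmath.exp(1j * math.pi * (z - 0.5)))

heights = [-1.0, -0.5, 0.0, 0.5, 1.0]
on_left_line = [abs(F(complex(0.0, v))) for v in heights]    # Re z = 0
on_right_line = [abs(F(complex(1.0, v))) for v in heights]   # Re z = 1
on_middle_line = [abs(F(complex(0.5, v))) for v in heights]  # Re z = 1/2
```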


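Proof of Theorem 5.39  There is no loss of generality in assuming that $A_0 > 0$ and
$A_1 > 0$. If $p_0 = p_1 = \infty$, then for all $f \in L^\infty(\Omega)$ we have $Tf \in L^{q_0}(\Omega') \cap L^{q_1}(\Omega')$ and
therefore, by Lemma 5.40,
\[ \|Tf\|_{q_\theta} \le \|Tf\|_{q_0}^{1-\theta} \|Tf\|_{q_1}^{\theta} \le A_0^{1-\theta} A_1^{\theta} \|f\|_\infty^{1-\theta} \|f\|_\infty^{\theta} = A_0^{1-\theta} A_1^{\theta} \|f\|_\infty. \]
This settles the case $p_0 = p_1 = \infty$. In the rest of the proof we may therefore assume that
$\min\{p_0, p_1\} < \infty$. This assumption implies $p_\theta < \infty$.

For $z \in \overline{S}$ define $p_z, q_z' \in \mathbb{C}$ by the relations
\[ \frac{1}{p_z} = \frac{1 - z}{p_0} + \frac{z}{p_1}, \qquad \frac{1}{q_z'} = \frac{1 - z}{q_0'} + \frac{z}{q_1'}, \]
where $q_0'$, $q_1'$, and $q_\theta'$ are the conjugate exponents of $q_0$, $q_1$, and $q_\theta$, respectively. Let
$a : \Omega \to \mathbb{C}$ and $b : \Omega' \to \mathbb{C}$ be $\mu$- and $\mu'$-simple functions and define, for each $z \in \overline{S}$, the
$\mu$- and $\mu'$-simple functions $f_z \in L^{p_0}(\Omega) \cap L^{p_1}(\Omega)$ and $g_z \in L^{q_0'}(\Omega') \cap L^{q_1'}(\Omega')$ by
\[ f_z(\omega) = \mathbf{1}_{\{a \ne 0\}} |a(\omega)|^{p_\theta/p_z} \frac{a(\omega)}{|a(\omega)|}, \qquad
 g_z(\omega') = \mathbf{1}_{\{b \ne 0\}} |b(\omega')|^{q_\theta'/q_z'} \frac{b(\omega')}{|b(\omega')|}. \tag{5.6} \]
Then $T f_z \in L^{q_0}(\Omega') \cap L^{q_1}(\Omega')$, and the function $F : \overline{S} \to \mathbb{C}$ defined by
\[ F(z) := \int_{\Omega'} (T f_z) \, g_z \, d\mu' \]
is easily checked to be bounded and continuous on $\overline{S}$, holomorphic on $S$, and for all
$v \in \mathbb{R}$ we have
\[ |F(iv)| \le A_0 \|f_{iv}\|_{p_0} \|g_{iv}\|_{q_0'} \le A_0 \|a\|_{p_\theta}^{p_\theta/p_0} \|b\|_{q_\theta'}^{q_\theta'/q_0'} \]
and similarly
\[ |F(1 + iv)| \le A_1 \|a\|_{p_\theta}^{p_\theta/p_1} \|b\|_{q_\theta'}^{q_\theta'/q_1'}. \]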
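Hence, by (5.6) and the three lines lemma (Lemma 5.41),
\[ \Bigl| \int_{\Omega'} (Ta) \, b \, d\mu' \Bigr| = |F(\theta)|
 \le A_0^{1-\theta} A_1^{\theta} \|a\|_{p_\theta}^{(1-\theta) p_\theta/p_0} \|b\|_{q_\theta'}^{(1-\theta) q_\theta'/q_0'} \|a\|_{p_\theta}^{\theta p_\theta/p_1} \|b\|_{q_\theta'}^{\theta q_\theta'/q_1'}
 = A_0^{1-\theta} A_1^{\theta} \|a\|_{p_\theta} \|b\|_{q_\theta'}. \]
Taking the supremum over all $\mu'$-simple functions $b \in L^{q_\theta'}(\Omega')$ of norm at most one, by
Hölder's inequality and Propositions 2.26 and 2.28 (here we use the assumption $p_\theta < \infty$)
we obtain
\[ \|Ta\|_{q_\theta} \le A_0^{1-\theta} A_1^{\theta} \|a\|_{p_\theta}. \]
Since the $\mu$-simple functions are dense in $L^{p_\theta}(\Omega)$, this proves that the restriction of $T$
to the $\mu$-simple functions has a unique extension to a bounded operator $\widetilde T$ from $L^{p_\theta}(\Omega)$
into $L^{q_\theta}(\Omega')$ of norm at most $A_0^{1-\theta} A_1^{\theta}$.

It remains to be checked that $\widetilde T f = Tf$ for all $f \in L^{p_0}(\Omega) \cap L^{p_1}(\Omega)$. To this end, we may assume
$p_0 \le p_1$. Selecting $\gamma > 0$ such that $\mu\{|f| = \gamma\} = 0$ and replacing $f$ by $\gamma^{-1} f$, we may
assume that $\mu\{|f| = 1\} = 0$. Write $f = \mathbf{1}_{\{|f| > 1\}} f + \mathbf{1}_{\{|f| < 1\}} f =: f^0 + f^1$ and observe that
$f^j \in L^{p_j}(\Omega)$ ($j = 0, 1$). If $f_n \to f$ in $L^{p_\theta}(\Omega)$ with each $f_n$ $\mu$-simple, then, with obvious
notation, $f_n^j \to f^j$ in $L^{p_j}(\Omega)$ and therefore $T f^j = \lim_{n \to \infty} T f_n^j = \lim_{n \to \infty} \widetilde T f_n^j = \widetilde T f^j$ in
$L^{q_j}(\Omega')$. As a consequence, $\widetilde T f = \widetilde T f^0 + \widetilde T f^1 = T f^0 + T f^1 = Tf$.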


Up to this point we have implicitly assumed that the scalar field is complex. Suppose
now that the scalar field is real. We may extend a bounded operator $T : L^p(\Omega) \to L^q(\Omega')$
to a bounded operator $T_{\mathbb{C}} : L^p(\Omega; \mathbb{C}) \to L^q(\Omega'; \mathbb{C})$ by setting
\[ T_{\mathbb{C}}(u + iv) := Tu + i \, Tv \]
for real-valued $u, v \in L^p(\Omega)$. The triangle inequality implies the trivial bounds
\[ \|T\| \le \|T_{\mathbb{C}}\| \le 2\|T\|. \]
If $T$ is a positivity preserving operator (that is, $f \ge 0$ implies $Tf \ge 0$), then the identity
\[ |a + ib| = \sup_{\theta \in [0, 2\pi]} |a \cos\theta + b \sin\theta| \]
(for a proof, rotate the point $(a, b) \in \mathbb{R}^2$ to the positive $x$-axis) together with the
inequality $|Tg| \le T|g|$ for real-valued $g \in L^p(\Omega)$ (which follows from (2.21)) implies the
pointwise bound
\[ |T_{\mathbb{C}} f| = |Tu + i \, Tv| = \sup_{\theta \in [0, 2\pi]} |(Tu) \cos\theta + (Tv) \sin\theta|
 \le \sup_{\theta \in [0, 2\pi]} T|u \cos\theta + v \sin\theta| \le T|u + iv| = T|f|. \]
This implies the norm bound $\|T_{\mathbb{C}}\| \le \|T\|$ and hence equality
\[ \|T_{\mathbb{C}}\| = \|T\|. \]
With some additional work, this equality can be extended to arbitrary bounded operators 
T and exponents | < p < q < ©; by duality and the fact that (Tc)* = (T*)c, this equality 
also holds for exponents 1 < g < p < ~. The proof is based on the observation that for 
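all $z = a + bi \in \mathbb{C}$ and $1 \le q < \infty$ we have
\[ |z| = \frac{1}{\|\gamma\|_q} \bigl( \mathbb{E} |a \gamma_1 + b \gamma_2|^q \bigr)^{1/q}, \tag{5.7} \]
where $\gamma, \gamma_1, \gamma_2$ are real-valued standard Gaussian random variables defined on some
probability space $\widetilde\Omega$, with $\gamma_1$ and $\gamma_2$ independent, and $\mathbb{E}$ denotes the expectation. Indeed,
if $|z| = 1$, then $a \gamma_1 + b \gamma_2$ is another standard Gaussian random variable and
\[ \bigl( \mathbb{E} |a \gamma_1 + b \gamma_2|^q \bigr)^{1/q} = \|\gamma\|_q. \]
The general case follows from this by scaling.

Assuming $1 \le p \le q < \infty$, from (5.7) in combination with Fubini's theorem we obtain
\[ \|\gamma\|_q^q \, \|T_{\mathbb{C}}(u + iv)\|_{L^q(\Omega'; \mathbb{C})}^q = \int_{\Omega'} \mathbb{E} |\gamma_1 Tu + \gamma_2 Tv|^q \, d\mu'
 = \mathbb{E} \int_{\Omega'} |\gamma_1 Tu + \gamma_2 Tv|^q \, d\mu' \]
\[ = \|T(\gamma_1 u + \gamma_2 v)\|_{L^q(\widetilde\Omega; L^q(\Omega'))}^q \le \|T\|^q \|\gamma_1 u + \gamma_2 v\|_{L^q(\widetilde\Omega; L^p(\Omega))}^q \]
\[ \le \|T\|^q \|\gamma_1 u + \gamma_2 v\|_{L^p(\Omega; L^q(\widetilde\Omega))}^q = \|T\|^q \|\gamma\|_q^q \, \|u + iv\|_{L^p(\Omega; \mathbb{C})}^q. \]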

In the penultimate step we used the continuous version of Hölder's inequality (Problem 2.29). The
Riesz–Thorin theorem may now be extended to the case of real scalars and exponents $1 \le p \le q < \infty$
as follows. Suppose that the assumptions of the theorem are satisfied, except that all spaces are real.
Applying the Riesz–Thorin theorem to the complexified operators $S_0 := (T_0)_{\mathbb{C}}$ and $S_1 := (T_1)_{\mathbb{C}}$, we
obtain a bounded operator $S_\theta$ from $L^{p_\theta}(\Omega; \mathbb{C})$ to
$L^{q_\theta}(\Omega'; \mathbb{C})$ of norm at most $A_0^{1-\theta} A_1^{\theta}$. This operator
maps real-valued functions in $L^{p_0}(\Omega) \cap L^{p_1}(\Omega)$ to real-valued functions
in $L^{q_\theta}(\Omega')$. Since $L^{p_0}(\Omega) \cap L^{p_1}(\Omega)$ is dense in
$L^{p_\theta}(\Omega)$, by approximation it follows that $S_\theta$ maps
real-valued functions in $L^{p_\theta}(\Omega)$ to real-valued functions
in $L^{q_\theta}(\Omega')$. Stated differently, $S_\theta$ restricts to
a bounded operator, denoted $T_\theta$, from $L^{p_\theta}(\Omega)$ to
$L^{q_\theta}(\Omega')$ of norm at most $A_0^{1-\theta} A_1^{\theta}$. On $L^{p_0}(\Omega) \cap L^{p_1}(\Omega)$, $T_\theta$ coincides with the common
restriction of $T_0$ and $T_1$.

[Portrait: Felix Hausdorff, 1868–1942]

Informally stated, this discussion shows that the Riesz—Thorin theorem extends, with 
the same constant, to the case of real scalars if we assume T to be positivity preserving 
or the exponents satisfy either 1 < pj,qgj <c or l=pj <q; < for j=0,1. 


5.7.b The Hausdorff-Young Theorem 


This brief section and the next are devoted to some applications of the Riesz–Thorin
theorem.

As we have seen in Section 5.5, the Fourier transform is bounded from $L^1(\mathbb{R}^d)$ to
$L^\infty(\mathbb{R}^d)$ and its restriction to $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ extends to an isometry from $L^2(\mathbb{R}^d)$ onto
itself. The Fourier transform with respect to the normalised Lebesgue measure $dm(x) =
(2\pi)^{-d/2} \, dx$ defined in Remark 5.18 is contractive from $L^1(\mathbb{R}^d, m)$ to $L^\infty(\mathbb{R}^d, m)$, and its
restriction to $L^1(\mathbb{R}^d, m) \cap L^2(\mathbb{R}^d, m)$ extends to an isometry from $L^2(\mathbb{R}^d, m)$ onto itself
by Remark 5.28. Accordingly, the Riesz–Thorin theorem implies:

Theorem 5.42 (Hausdorff–Young). Let $1 \le p \le 2$ and $\frac1p + \frac1q = 1$. The restriction to
$L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ of the Fourier transform has a unique extension to a bounded operator
from $L^p(\mathbb{R}^d)$ to $L^q(\mathbb{R}^d)$. With respect to the normalised Lebesgue measure, the Fourier
transform has a unique extension to a contraction from $L^p(\mathbb{R}^d, m)$ to $L^q(\mathbb{R}^d, m)$.

A similar result holds for the Fourier transform on the circle (see Problem 5.27).


5.7.c $L^p$-Boundedness of the Hilbert Transform

A second application of the Riesz–Thorin theorem is the following theorem due to M.
Riesz about $L^p$-boundedness of the Hilbert transform.

Theorem 5.43 (Riesz). For all $1 < p < \infty$ the restriction of the Hilbert transform to
$L^2(\mathbb{R}) \cap L^p(\mathbb{R})$ has a unique extension to a bounded operator on $L^p(\mathbb{R})$.

The proof of Theorem 5.43 is based on a couple of lemmas.

Lemma 5.44. If $f \in C_c^1(\mathbb{R})$, then $Hf \in L^p(\mathbb{R})$ for all $2 \le p \le \infty$.

Proof  Let $I$ be a bounded interval containing the support of $f$. The pointwise identity
\[ H_\varepsilon f(x) = \frac{1}{\pi} \int_\varepsilon^\infty \frac{f(x - y) - f(x + y)}{y} \, dy
 = \frac{1}{\pi} \int_\varepsilon^\infty \mathbf{1}_{(-I + x) \cup (I - x)}(y) \, \frac{f(x - y) - f(x + y)}{y} \, dy \]
implies the bound
\[ |H_\varepsilon f(x)| \le \frac{1}{\pi} \cdot 2|I| \cdot 2\|f'\|_\infty, \qquad x \in \mathbb{R}. \tag{5.8} \]
As $\varepsilon \downarrow 0$, we have $H_\varepsilon f \to Hf$ in $L^2(\mathbb{R})$ by Theorem 5.36 and, upon passing to an almost
everywhere convergent subsequence, (5.8) implies that $Hf \in L^\infty(\mathbb{R})$. Since $Hf \in L^2(\mathbb{R}) \cap L^\infty(\mathbb{R})$, this gives the
result.


Lemma 5.45. The Hilbert transform of a real-valued function $u \in L^2(\mathbb{R})$ is the unique
real-valued function $v \in L^2(\mathbb{R})$ such that the Fourier–Plancherel transform of $u + iv$
vanishes on $\mathbb{R}_-$.


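Proof That the Fourier transform of $u + iHu$ vanishes on $\mathbb{R}_-$ is immediate from Theorem 5.36.
In the converse direction, let $u,v \in L^2(\mathbb{R})$ be real-valued such that the Fourier–Plancherel
transform of $u + iv$ vanishes on $\mathbb{R}_-$. Then for almost all $\xi > 0$ we have

$$ 0 = \hat u(-\xi) + i\hat v(-\xi) = \overline{\hat u(\xi)} + i\,\overline{\hat v(\xi)} = \overline{\hat u(\xi) - i\hat v(\xi)}, $$

so $\hat v(-\xi) = i\hat u(-\xi)$ and $\hat v(\xi) = -i\hat u(\xi)$. Hence, for almost all $\xi \in \mathbb{R}$,

$$ \hat v(\xi) = -i\,\mathrm{sign}(\xi)\,\hat u(\xi). $$

By Theorem 5.36 this proves that $v = Hu$.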


In the next lemma we use that if $\phi \in C_{\rm c}^2(\mathbb{R})$, then $\widehat{\phi''}(\xi) = -|\xi|^2\hat\phi(\xi)$ is bounded,
and therefore $\hat\phi$ is integrable.


Lemma 5.46 (Cotlar). Let $H$ be the Hilbert transform on $L^2(\mathbb{R})$. For all real-valued
$u \in C_{\rm c}^2(\mathbb{R})$ we have

$$ (Hu)^2 = u^2 + 2H(uHu). $$

Proof Let $u,v \in C_{\rm c}^2(\mathbb{R})$ be real-valued functions. By Theorem 5.36 the Fourier transforms
of $u + iHu$ and $v + iHv$ are integrable and vanish on $\mathbb{R}_-$, and by Proposition 5.30
the same is true for the Fourier transform of

$$ (u\cdot Hv + Hu\cdot v) + i(Hu\cdot Hv - u\cdot v) = -i(u + iHu)(v + iHv). $$

By Lemma 5.45, this implies

$$ Hu\cdot Hv - u\cdot v = H(u\cdot Hv + Hu\cdot v). $$

Cotlar’s identity follows by taking $u = v$.


Proof of Theorem 5.43 The proof consists of three steps. First we prove the theorem
for exponents $p = 2^n$ with $n \in \mathbb{N}$ by Cotlar’s identity, then for $2 \le p < \infty$ by interpolation,
and finally for $1 < p < 2$ by duality.


Step 1 – In this step we show that if $H$ is $L^p$-bounded for some $2 \le p < \infty$, then $H$ is
$L^{2p}$-bounded. The proof also gives a bound for $\|H\|_{2p}$ in terms of $\|H\|_p$. In what follows
we set $\|H\|_p =: c_p$.

Let $u \in C_{\rm c}^2(\mathbb{R})$ be real-valued. Then $Hu \in L^{2p}(\mathbb{R})$ by Lemma 5.44. By Cotlar’s identity
and Hölder’s inequality,

$$ \|(Hu)^2\|_p \le \|u^2\|_p + 2\|H(uHu)\|_p \le \|u^2\|_p + 2c_p\|u\|_{2p}\|Hu\|_{2p}. $$

Using the identity $\|v^2\|_p = \|v\|_{2p}^2$ this gives

$$ \|Hu\|_{2p}^2 \le \|u\|_{2p}^2 + 2c_p\|u\|_{2p}\|Hu\|_{2p}, $$

or equivalently,

$$ (\|Hu\|_{2p} - c_p\|u\|_{2p})^2 \le (1 + c_p^2)\|u\|_{2p}^2. $$

It follows that

$$ \|Hu\|_{2p} - c_p\|u\|_{2p} \le \sqrt{1 + c_p^2}\,\|u\|_{2p} $$

and hence

$$ \|Hu\|_{2p} \le \big(c_p + \sqrt{1 + c_p^2}\,\big)\|u\|_{2p}. $$

By considering real and imaginary parts separately, at the expense of an additional constant
2 this inequality extends to arbitrary $u \in C_{\rm c}^2(\mathbb{R})$. Since $C_{\rm c}^2(\mathbb{R})$ is dense in $L^{2p}(\mathbb{R})$,
it follows that the restriction of $H$ to $C_{\rm c}^2(\mathbb{R})$ uniquely extends to a bounded operator on
$L^{2p}(\mathbb{R})$. Obviously, this operator extends the restriction of $H$ to $L^2(\mathbb{R})\cap L^{2p}(\mathbb{R})$.


Step 2 – Since $H$ is $L^2$-bounded, Step 1 implies that $H$ is $L^{2^n}$-bounded for all $n \in \mathbb{N}$.
The Riesz–Thorin theorem then implies that $H$ is $L^p$-bounded for all $2 \le p < \infty$.


Step 3 – Finally suppose that $1 < p < 2$ and let $\frac1p + \frac1q = 1$. For $f,g \in C_{\rm c}^2(\mathbb{R})$ one easily
checks that

$$ \int_{\mathbb{R}} Hf\cdot g\,dx = -\int_{\mathbb{R}} f\cdot Hg\,dx $$

and consequently

$$ \Big|\int_{\mathbb{R}} Hf\cdot g\,dx\Big| \le \|f\|_p\|Hg\|_q \le c_q\|f\|_p\|g\|_q, $$

where $c_q = \|H\|_q$. Since $C_{\rm c}^2(\mathbb{R})$ is dense in $L^q(\mathbb{R})$ by Proposition 2.29, Proposition 2.26
implies that $Hf \in L^p(\mathbb{R})$ and $\|Hf\|_p \le c_q\|f\|_p$. This proves that $H$ is $L^p$-bounded, with
$\|H\|_p \le c_q$ (in fact we have equality here, since we can also apply this argument in the
opposite direction).
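
The recursion from Step 1 can be iterated numerically to see how fast the resulting constants grow. The sketch below is only an illustration: the function names are hypothetical, it tracks the bound for real-valued functions (before the extra factor 2 for complex-valued ones), and it assumes the starting value $c_2 = 1$, which is consistent with the Fourier multiplier $-i\,\mathrm{sign}(\xi)$ having modulus one.

```python
import math


def next_bound(c_p):
    """Bound for the L^{2p} norm obtained in Step 1 from a bound c_p on L^p."""
    return c_p + math.sqrt(1.0 + c_p * c_p)


def dyadic_bounds(n_max, c_2=1.0):
    """Bounds on L^{2^n} for n = 1, ..., n_max, starting from the assumed value c_2."""
    bounds = [c_2]
    for _ in range(n_max - 1):
        bounds.append(next_bound(bounds[-1]))
    return bounds


if __name__ == "__main__":
    for n, c in enumerate(dyadic_bounds(5), start=1):
        print(f"p = 2^{n} = {2 ** n:3d}: bound {c:.6f}")
```

The bounds roughly double at each step, so they grow linearly in $p = 2^n$; interpolation in Step 2 then fills in the exponents between consecutive powers of two.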
5.7.d The Marcinkiewicz Interpolation Theorem


In this final section we prove another $L^p$-interpolation theorem, the Marcinkiewicz interpolation
theorem. It has the virtue of requiring less stringent conditions at the endpoints
and the operator to be interpolated does not even need to be linear. On the downside, the
constant obtained from the proof is rather poor. The theorem elaborates on the observation,
made after the proof of the Hardy–Littlewood maximal theorem, that the proof of
the $L^p$-bound essentially only depended on the weak $L^1$-bound.

By $(L^{p_0} + L^{p_1})(\mathbb{R}^d)$ we denote the vector space of all $f \in L^1_{\rm loc}(\mathbb{R}^d)$ that admit a decomposition
$f = f_0 + f_1$ with $f_0 \in L^{p_0}(\mathbb{R}^d)$ and $f_1 \in L^{p_1}(\mathbb{R}^d)$.


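Theorem 5.47 (Marcinkiewicz interpolation theorem). Let $1 \le p_0 < p < p_1 \le \infty$ and
suppose that $T : (L^{p_0} + L^{p_1})(\mathbb{R}^d) \to (L^{p_0} + L^{p_1})(\mathbb{R}^d)$ is a subadditive mapping in the
sense that for all $f \in L^{p_0}(\mathbb{R}^d)$ and $g \in L^{p_1}(\mathbb{R}^d)$ we have

$$ |T(f+g)| \le |T(f)| + |T(g)| \quad \text{almost everywhere.} $$

Suppose furthermore that for $j = 0,1$ there are constants $C_{d,p_j} \ge 0$, depending only on
$d$ and $p_j$, such that

$$ |\{|T(f)| > t\}| \le \Big(\frac{C_{d,p_j}\|f\|_{p_j}}{t}\Big)^{p_j}, \qquad t > 0,\ f \in L^{p_j}(\mathbb{R}^d), $$

if $1 \le p_1 < \infty$; if $p_1 = \infty$ we replace the assumption regarding $p_1$ by

$$ \|T(f)\|_\infty \le C_{d,\infty}\|f\|_\infty, \qquad f \in L^\infty(\mathbb{R}^d). $$

Then $T$ maps $L^p(\mathbb{R}^d)$ into $L^p(\mathbb{R}^d)$ and

$$ \|T(f)\|_p \le C_{d,p}\|f\|_p, \qquad f \in L^p(\mathbb{R}^d), $$

where $C_{d,p}$ is a constant independent of $f$.


A weak $L^q$-bound holds if $T$ is $L^q$-bounded in the sense that $\|T(f)\|_q \le C_{d,q}\|f\|_q$ for
all $f \in L^q(\mathbb{R}^d)$.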


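Proof We give the proof for $1 \le p_1 < \infty$; the case $p_1 = \infty$ proceeds along the lines of
Theorem 2.38, requiring small changes that are left to the reader. Fixing $t > 0$, we split
$f = f_0 + f_1$ with $f_0 \in L^{p_0}(\mathbb{R}^d)$ and $f_1 \in L^{p_1}(\mathbb{R}^d)$ by taking

$$ f_0 = \mathbf{1}_{\{|f| > t\}} f, \qquad f_1 = \mathbf{1}_{\{|f| \le t\}} f. $$

From the subadditivity of $T$ we obtain

$$ \{|T(f)| > t\} \subseteq \{|T(f_0)| > t/2\} \cup \{|T(f_1)| > t/2\} $$

and therefore

$$ |\{|T(f)| > t\}| \le |\{|T(f_0)| > t/2\}| + |\{|T(f_1)| > t/2\}|. $$

Combining the assumptions with Fubini’s theorem and proceeding as in the proof of
Theorem 2.38, after some computations we arrive at

$$ \int_{\mathbb{R}^d} |T(f)|^p\,dx = p\int_0^\infty t^{p-1}|\{|T(f)| > t\}|\,dt $$
$$ \le p\int_0^\infty t^{p-1}\Big(\Big(\frac{2C_{d,p_0}}{t}\Big)^{p_0}\int_{\{|f|>t\}}|f|^{p_0}\,dx + \Big(\frac{2C_{d,p_1}}{t}\Big)^{p_1}\int_{\{|f|\le t\}}|f|^{p_1}\,dx\Big)\,dt $$
$$ \le C_{d,p}^p\int_{\mathbb{R}^d}|f|^p\,dx. $$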
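
The first equality in the last display is the layer-cake formula $\int |g|^p = p\int_0^\infty t^{p-1}|\{|g|>t\}|\,dt$. As a sanity check, the following sketch verifies it exactly for the counting measure on a finite set, where the distribution function is a step function. The function names are illustrative only and do not come from the text.

```python
def lp_norm_p(values, p):
    """Sum of |g_i|^p, i.e. the p-th power of the L^p norm for counting measure."""
    return sum(abs(v) ** p for v in values)


def layer_cake(values, p):
    """Evaluate p * int_0^inf t^(p-1) * #{i : |g_i| > t} dt exactly.

    Between consecutive sorted values of |g_i| the distribution function is
    constant, so the integral splits into finitely many elementary pieces.
    """
    levels = sorted(abs(v) for v in values)
    total, previous = 0.0, 0.0
    for k, level in enumerate(levels):
        count_above = len(levels) - k  # number of i with |g_i| > t for previous < t < level
        total += count_above * (level ** p - previous ** p)
        previous = level
    return total


if __name__ == "__main__":
    g = [3.0, -1.0, 2.5, 0.0, -4.0]
    print(lp_norm_p(g, 1.7), layer_cake(g, 1.7))
```

Both functions return the same number up to rounding, since the sum in `layer_cake` telescopes to the sum of the $p$-th powers.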


Problems 


5.1 Let $(x_n)_{n\ge1}$ be a sequence in a Banach space $X$ with dense linear span. Using
Baire’s theorem, prove that this linear span equals $X$ if and only if $\dim X < \infty$.
5.2 Using the Baire category theorem, prove that there exists no norm on $L^1_{\rm loc}(\mathbb{R}^d)$
that turns this space into a Banach lattice.
Hint: Use Theorem 2.57.
5.3 This problem outlines a proof of the uniform boundedness theorem that does not
appeal to the Baire category theorem.
Suppose that $(T_i)_{i\in I}$ is an operator family such that:
(i) $\sup_{i\in I}\|T_ix\| =: C_x < \infty$ for all $x \in X$;
(ii) $\sup_{i\in I}\|T_i\| = \infty$.
For $n = 1,2,\dots$ choose indices $i_n \in I$ and vectors $x_n \in X$ such that

$$ \|T_{i_n}\| \ge 4\cdot 3^n\Big(\sum_{m=1}^{n-1} C_{x_m} + n\Big), \qquad \|x_n\| \le \frac{1}{3^n}, \qquad \|T_{i_n}x_n\| \ge \frac{3}{4\cdot 3^n}\|T_{i_n}\|. $$

Let $x := \sum_{n\ge1} x_n$. By writing

$$ T_{i_n}x = \sum_{m=1}^{n-1} T_{i_n}x_m + T_{i_n}x_n + \sum_{m=n+1}^{\infty} T_{i_n}x_m $$

and estimating these terms, prove that $\|T_{i_n}x\| \ge n$ for all $n \ge 1$. This contradiction
proves the result.
5.4 Let $X$ be the linear span of the standard basis vectors of $\ell^2$ and let $P_n : \ell^2 \to \mathbb{K}$
denote the orthogonal projection onto the $n$th coordinate. Show that $nP_nx \to 0$ for
all $x \in X$ and $\|nP_n\| = n \to \infty$. Conclude that the completeness assumption cannot
be omitted from the uniform boundedness theorem.

5.5 The aim of this problem is to prove that a weakly holomorphic function is holomorphic.
Let us start with the definitions of these notions. We fix an open set
$D \subseteq \mathbb{C}$ and a complex Banach space $X$. A function $f : D \to X$ is said to be:


• holomorphic, if for all $z_0 \in D$ the limit
$$ \lim_{z\to z_0}\frac{f(z) - f(z_0)}{z - z_0} $$
exists in $X$ (see Problem 4.2);

• weakly holomorphic, if for all $z_0 \in D$ and $x^* \in X^*$ the limit
$$ \lim_{z\to z_0}\Big\langle \frac{f(z) - f(z_0)}{z - z_0}, x^*\Big\rangle $$
exists in $\mathbb{C}$.


Obviously every holomorphic function $f : D \to X$ is weakly holomorphic.
Now let $f : D \to X$ be weakly holomorphic, fix $z_0 \in D$, and let $r > 0$ be so small
that the closed disc $\{z \in \mathbb{C} : |z - z_0| \le r\}$ is contained in $D$.


(a) Applying Proposition 5.5 to the set

$$ U = \Big\{ \frac{1}{h-g}\Big(\frac{f(z_0+h) - f(z_0)}{h} - \frac{f(z_0+g) - f(z_0)}{g}\Big) : |g|,|h| < \frac{r}{2} \Big\} $$

and using the Cauchy integral formula for $X$-valued holomorphic functions
(see Problem 4.2), prove that there is a constant $M \ge 0$ such that for all
$|g|,|h| < r/2$ we have

$$ \Big\| \frac{f(z_0+h) - f(z_0)}{h} - \frac{f(z_0+g) - f(z_0)}{g} \Big\| \le M|h - g|. $$

(b) Deduce that every weakly holomorphic function $f : D \to X$ is holomorphic.


5.6 State and prove an analogue of Proposition 5.5 for the weak* topology.
5.7 Using the open mapping theorem, show that there exists no complete norm $|||\cdot|||$
on $C[0,1]$ with the property that

$$ |||f_n - f||| \to 0 \iff f_n \to f \text{ pointwise.} $$

5.8 A sequence $(x_n)_{n\ge1}$ in $X$ is called a Schauder basis if every $x \in X$ admits a
unique representation as a convergent sum $x = \sum_{n\ge1} c_nx_n$ with $c_n \in \mathbb{K}$ for all $n \ge 1$.

Let $(x_n)_{n\ge1}$ be a Schauder basis in $X$, and let $Y$ be the vector space of all scalar
sequences $c = (c_n)_{n\ge1}$ such that the sum $\sum_{n\ge1} c_nx_n$ converges in $X$.


(a) Show that

$$ \|c\|_Y := \sup_{N\ge1}\Big\|\sum_{n=1}^N c_nx_n\Big\| $$

defines a norm on $Y$ and that $Y$ is a Banach space with respect to this norm.
(b) Show that $c \mapsto \sum_{n\ge1} c_nx_n$ is an isomorphism from $Y$ onto $X$.
(c) Conclude that the coordinate projections

$$ P_k : \sum_{n\ge1} c_nx_n \mapsto c_k $$

are bounded and that $\sup_{k\ge1}\|P_k\| < \infty$.

5.9 For $f \in L^1(-\pi,\pi)$ and $N \in \mathbb{N}$ define the functions

$$ s_Nf(t) := \sum_{n=-N}^{N} \hat f(n)\exp(int), \qquad t \in [-\pi,\pi], $$

where $\hat f(n)$ is the $n$th Fourier coefficient of $f$, that is,

$$ \hat f(n) := \frac{1}{2\pi}\int_{-\pi}^{\pi} f(s)\exp(-ins)\,ds, \qquad n \in \mathbb{Z}. $$

By the results of Section 3.5.a, for all $f \in L^2(-\pi,\pi)$ we have

$$ f = \lim_{N\to\infty} s_Nf = \sum_{n\in\mathbb{Z}} \hat f(n)\exp(in(\cdot)) $$

with convergence in $L^2(-\pi,\pi)$; the series on the right-hand side is the Fourier
series of $f$. One might express the hope that if $f \in C[-\pi,\pi]$ is periodic, then
its Fourier series converges to $f$ with respect to the norm of $C[-\pi,\pi]$. The aim
of this problem is to prove that this is wrong in a strong sense: there exists a
function $f \in C[-\pi,\pi]$ which is periodic in the sense that $f(-\pi) = f(\pi)$
and whose Fourier series diverges at $t = 0$.

(a) Show that $s_Nf(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(s)D_N(t-s)\,ds$, where the Dirichlet kernel is
given by

$$ D_N(t) := \sum_{n=-N}^{N}\exp(int) = \frac{\sin((N+\frac12)t)}{\sin(\frac12 t)}. $$

(b) Show that $\|A_N\| = \|D_N\|_1$, where the linear map $A_N : C_{\rm per}[-\pi,\pi] \to \mathbb{C}$ is
given by $A_Nf := s_Nf(0)$.

Hint: To prove the inequality $\|A_N\| \ge \|D_N\|_1$, approximate $\mathrm{sign}(D_N)$ pointwise
almost everywhere by a sequence of continuous periodic functions $f_n$
of norm $\le 1$, set $g_n(t) = f_n(-t)$, and use dominated convergence to obtain

$$ \lim_{n\to\infty} A_N(g_n) = \lim_{n\to\infty}\frac{1}{2\pi}\int_{-\pi}^{\pi} f_n(t)D_N(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} |D_N(t)|\,dt. $$

Fill in the missing details.

(c) Show that $\lim_{N\to\infty}\|D_N\|_1 = \infty$.
Hint: Use that $|\sin(x)| \le |x|$ for all $x \in \mathbb{R}$, and then perform some careful
estimates on the resulting integral.

(d) Apply the uniform boundedness theorem to prove that $s_Nf(0) \not\to f(0)$ as
$N \to \infty$ for some $f \in C_{\rm per}[-\pi,\pi]$.


5.10 Let $(a_n)_{n\ge1}$ be a scalar sequence with the property that the sum $\sum_{n\ge1} a_nb_n$ converges
for all scalar sequences $(b_n)_{n\ge1}$ satisfying $\sum_{n\ge1}|b_n|^2 < \infty$.


(a) Show that $\sum_{n\ge1} a_nb_n$ converges absolutely for all scalar sequences $(b_n)_{n\ge1}$
satisfying $\sum_{n\ge1}|b_n|^2 < \infty$.
(b) Show that $\sum_{n\ge1}|a_n|^2 < \infty$.
Hint: Apply the closed graph theorem to the mapping $T : \ell^2 \to \ell^1$ defined by
$T : (b_n)_{n\ge1} \mapsto (a_nb_n)_{n\ge1}$. Conclude that $(a_n)_{n\ge1}$ defines a bounded functional
on $\ell^2$.
5.11 Consider a Banach space with a direct sum decomposition $X = X_0 \oplus X_1$. Prove
that the projections onto the summands define isomorphisms of Banach spaces

$$ X/X_0 \simeq X_1, \qquad X/X_1 \simeq X_0. $$

5.12 Let $X$ be a Banach space with direct sum decomposition $X = X_0 \oplus X_1$. Show that if
$T_0 : X_0 \to Y$ and $T_1 : X_1 \to Y$ are bounded operators, then the operator $T := T_0 \oplus T_1$
from $X$ to $Y$ defined by

$$ T(x_0 + x_1) := T_0x_0 + T_1x_1 $$

is bounded. What can be said about the norm of $T$?
Hint: First show that $|||x_0 + x_1||| := \|x_0\| + \|x_1\|$ is an equivalent norm on $X$.

5.13 The aim of this problem is to prove that the set of surjective operators from a
Banach space $X$ into a Banach space $Y$ is open in $\mathscr{L}(X,Y)$.


(a) Let $T \in \mathscr{L}(X,Y)$ be a surjective operator. Show that there is a constant $A \ge 0$
such that for all $y \in Y$ there exists an $x \in X$ such that $\|x\| \le A\|y\|$ and $Tx = y$.

(b) Let $T \in \mathscr{L}(X,Y)$ be a bounded operator. Suppose there exist constants
$A \ge 0$ and $0 \le B < 1$ such that for all $y \in Y$ with $\|y\| \le 1$ there exists an $x \in X$
such that $\|x\| \le A$ and $\|Tx - y\| \le B$. Show that $T$ is surjective.

Hint: Look into the proof of Lemma 5.7.

(c) Show that if $T \in \mathscr{L}(X,Y)$ is surjective and $S \in \mathscr{L}(X,Y)$ is a bounded operator
satisfying $\|S\| < 1/A$, where $A$ is the constant of part (a), then $T + S$ is
surjective.

Hint: Apply part (b) with $B = A\|S\|$.


5.14 Let $X$ be a vector space.
(a) Suppose that $\|\cdot\|_1$ and $\|\cdot\|_2$ are two norms on $X$, each of which turns $X$
into a Banach space. Show that if there exists a constant $C \ge 0$ such that
$\|x\|_1 \le C\|x\|_2$ for all $x \in X$, then the two norms are equivalent.

(b) Find the mistake in the following “proof” that every two norms $\|\cdot\|$ and $\|\cdot\|'$
turning $X$ into a Banach space are equivalent. Define

$$ |||x||| := \|x\| + \|x\|', \qquad x \in X. $$

This is a norm which turns $X$ into a Banach space and we have $\|x\| \le |||x|||$
and $\|x\|' \le |||x|||$. Hence part (a) implies that $\|\cdot\|$ and $|||\cdot|||$ are equivalent and
that $\|\cdot\|'$ and $|||\cdot|||$ are equivalent. It follows that $\|\cdot\|$ and $\|\cdot\|'$ are equivalent.


5.15 Let $(\Omega,\mathscr{F},\mu)$ be a measure space, let $1 \le p \le \infty$, and suppose that $f : \Omega \to X$ is
a function which has the property that the scalar-valued function $\omega \mapsto \langle f(\omega),x^*\rangle$
belongs to $L^p(\Omega)$ for every $x^* \in X^*$.

(a) Show that the mapping $T : X^* \to L^p(\Omega)$ defined by $x^* \mapsto \langle f(\cdot),x^*\rangle$ is closed.
(b) Deduce that there exists a constant $C \ge 0$ such that

$$ \|\langle f(\cdot),x^*\rangle\|_{L^p(\Omega)} \le C\|x^*\|, \qquad x^* \in X^*. $$

5.16 Let $(\Omega,\mathscr{F},\mu)$ be a finite measure space and let $X$ be a Banach space.
(a) Let $f : \Omega \to X$ be a strongly measurable function. Show that if there is an
exponent $1 < p \le \infty$ such that $\langle f(\cdot),x^*\rangle \in L^p(\Omega)$ for all $x^* \in X^*$ then there
exists a unique element $x \in X$, the Pettis integral of $f$ with respect to $\mu$,
such that

$$ \langle x,x^*\rangle = \int_\Omega \langle f(\omega),x^*\rangle\,d\mu(\omega), \qquad x^* \in X^*. $$

Hint: The integrals $\int_\Omega \mathbf{1}_{\{\|f\|\le n\}}f\,d\mu$ are well defined as Bochner integrals.
(b) Show that the result of part (a) fails for $p = 1$.

Hint: Let $(A_n)_{n\ge1}$ be a sequence of disjoint intervals of positive measure $|A_n|$
in the interval $(0,1)$ and consider the function $f : (0,1) \to c_0$ defined by

$$ f(t) := \sum_{n\ge1}\frac{\mathbf{1}_{A_n}(t)}{|A_n|}\,e_n, \qquad t \in (0,1), $$

where $(e_n)_{n\ge1}$ is the sequence of standard unit vectors in $c_0$.
5.17 Write out a proof of Theorem 5.32.
5.18 For $n \ge 1$ and $f \in L^2(\mathbb{R}^d)$ let

$$ \mathscr{F}_nf(\xi) := \frac{1}{(2\pi)^{d/2}}\int_{\{|x|\le n\}} f(x)\exp(-ix\cdot\xi)\,dx, \qquad \xi \in \mathbb{R}^d. $$

Show that $\mathscr{F}_n$ maps $L^2(\mathbb{R}^d)$ into itself and defines a bounded operator on $L^2(\mathbb{R}^d)$,
and show that for all $f \in L^2(\mathbb{R}^d)$ we have the following identity for the Fourier–Plancherel
transform:

$$ \mathscr{F}f = \lim_{n\to\infty}\mathscr{F}_nf, $$

where the limit is taken in $L^2(\mathbb{R}^d)$.
5.19 It was shown in Lemma 5.20 that for $\lambda > 0$ the Fourier transform of $g(x) =
(2\pi)^{-d/2}\exp(-\frac12\lambda|x|^2)$ is given as

$$ \hat g(\xi) = (2\pi\lambda)^{-d/2}\exp\big(-\tfrac12|\xi|^2/\lambda\big). $$

Give an alternative proof of this identity by completing the following steps:
(a) it suffices to prove the identity for $\lambda = 1$ and in dimension $d = 1$;
(b) the function $u(x) := (2\pi)^{-1/2}\exp(-\frac12x^2)$ solves the differential equation
$u'(x) + xu(x) = 0$;
(c) the Fourier transform of $u$ also satisfies the differential equation;
(d) apply the Picard–Lindelöf theorem (Theorem 2.12).
5.20 Consider the Fourier–Plancherel transform $\mathscr{F} : f \mapsto \hat f$ on $L^2(\mathbb{R}^d)$.


(a) Show that $\mathscr{F}^2 = R$, where $Rf(x) := f(-x)$ is the reflection operator on
$L^2(\mathbb{R}^d)$.
(b) Deduce that $\mathscr{F}^4 = I$.
5.21 Prove that the Hilbert transform $H$ on $L^2(\mathbb{R})$ satisfies $H^2 = -I$.
5.22 This problem establishes a connection between the Hilbert transform and the theory
of harmonic functions.
For real-valued functions $f \in L^2(\mathbb{R})$ we define $u_f : \mathbb{R}\times(0,\infty) \to \mathbb{R}$ by

$$ u_f(x,y) := P_y * f(x), \qquad x \in \mathbb{R},\ y > 0, $$

where

$$ P_y(x) := \frac{1}{\pi}\,\frac{y}{x^2+y^2}, \qquad x \in \mathbb{R},\ y > 0, $$

is the Poisson kernel.


(a) Show that $u_f$ is harmonic, that is, $u_f \in C^2(\mathbb{R}\times(0,\infty))$ and $\Delta u_f = 0$.

(b) Show that $u_f + iu_{Hf}$ is holomorphic.
5.23 Let $f \in L^1(\mathbb{R})$ satisfy $\hat f(-\xi) = 0$ for almost all $\xi > 0$.

(a) Show that for all $y > 0$ the function $P_y * f$, where $P_y$ is the Poisson kernel, is
integrable and its Fourier transform belongs to $L^1(\mathbb{R})\cap L^2(\mathbb{R})$.
Hint: Compute the Fourier transform of $P_y$.

(b) Using Fourier inversion, prove that the function

$$ x + iy \mapsto P_y * f(x) $$

is holomorphic on the open half-plane $\{z = x + iy : y > 0\}$.
(c) Using Proposition 2.34, show that

$$ \lim_{y\downarrow0}\|P_y * f - f\|_{L^1(\mathbb{R})} = 0. $$

Somewhat informally, part (c) states that every $L^1$-function whose Fourier transform
vanishes on the negative half-line is the boundary value (in the $L^1$-sense) of
a holomorphic function on the upper half-plane.


(d) State and prove a version of part (c) for the disc.


5.24 We use the notation introduced in Problem 2.27. Let $\mathscr{F}$ be the Fourier–Plancherel
transform on $L^2(\mathbb{R}^d)$ and let $X$ be a Banach space. On the space $L^2(\mathbb{R}^d)\otimes X$ we
define the linear operator $\mathscr{F}\otimes I$ by

$$ (\mathscr{F}\otimes I)(f\otimes x) := (\mathscr{F}f)\otimes x, \qquad f \in L^2(\mathbb{R}^d),\ x \in X. $$

(a) Show that this operator is well defined.

(b) Show that if $X = \ell^p$ with $1 \le p \le \infty$, then $\mathscr{F}\otimes I$ extends to a bounded operator
on $L^2(\mathbb{R};\ell^p)$ if and only if $p = 2$.
Hint: For $1 \le p < 2$ consider the functions

$$ f_N := \sum_{n=0}^{N} f(\cdot + 2\pi n)\otimes e_{n+1}, $$

where $(e_n)_{n\ge1}$ is the sequence of standard unit vectors in $\ell^p$ and $0 \ne f \in
C_{\rm c}(\mathbb{R})$ has support in the interval $(-\pi,\pi)$; for $2 < p \le \infty$ use the functions


N 
iy:=¥ e"O FQ en4t. 
n=0 


5.25 Let $1 \le p \le \infty$. Young’s inequality implies that the convolution of a function $f \in
L^1(\mathbb{R}^d)$ with a function $g \in L^p(\mathbb{R}^d)$ belongs to $L^p(\mathbb{R}^d)$ and $\|f * g\|_p \le \|f\|_1\|g\|_p$.

(a) Write out the proof of this result obtained by taking $r = 1$ in the proof of
Proposition 2.33.


The special case of Young’s inequality just stated can be reformulated as saying
that for every $g \in L^p(\mathbb{R}^d)$ the convolution operator $C_g : f \mapsto f * g$ is bounded from
$L^1(\mathbb{R}^d)$ to $L^p(\mathbb{R}^d)$ with norm

$$ \|C_g\|_{\mathscr{L}(L^1(\mathbb{R}^d),L^p(\mathbb{R}^d))} \le \|g\|_p. $$

(b) Let $\frac1p + \frac1q = 1$. Using Hölder’s inequality, show that the restriction of $C_g$ to
$L^1(\mathbb{R}^d)\cap L^q(\mathbb{R}^d)$ extends uniquely to a bounded operator from $L^q(\mathbb{R}^d)$ to
$L^\infty(\mathbb{R}^d)$ of norm

$$ \|C_g\|_{\mathscr{L}(L^q(\mathbb{R}^d),L^\infty(\mathbb{R}^d))} \le \|g\|_p. $$

(c) Use the Riesz–Thorin interpolation theorem to obtain an alternative proof of
the general form of Young’s inequality.


5.26 Let $(\Omega,\mathscr{F},\mu)$ be a measure space and let $1 \le p,q \le \infty$ satisfy $\frac1p + \frac1q = 1$. The
aim of this problem is to use the Riesz–Thorin interpolation theorem to derive the
Clarkson inequalities:


(1) if $1 \le p \le 2$, then for all $f,g \in L^p(\Omega)$ we have
$$ (\|f+g\|_p^p + \|f-g\|_p^p)^{1/p} \le 2^{1/p}(\|f\|_p^p + \|g\|_p^p)^{1/p} $$
and
$$ (\|f+g\|_p^q + \|f-g\|_p^q)^{1/q} \le 2^{1/q}(\|f\|_p^p + \|g\|_p^p)^{1/p}; $$
(2) if $2 \le p \le \infty$, then for all $f,g \in L^p(\Omega)$ we have
$$ (\|f+g\|_p^p + \|f-g\|_p^p)^{1/p} \le 2^{1/q}(\|f\|_p^p + \|g\|_p^p)^{1/p} $$
and
$$ (\|f+g\|_p^p + \|f-g\|_p^p)^{1/p} \le 2^{1/p}(\|f\|_p^q + \|g\|_p^q)^{1/q}. $$

Let $1 \le r,s \le \infty$. On the cartesian product $L^r(\Omega)\times L^r(\Omega)$ we consider the norm

$$ \|(f,g)\|_{r,s} := (\|f\|_r^s + \|g\|_r^s)^{1/s}. $$

(a) Show that the resulting normed space $X_{r,s}(\Omega)$ is complete.


On $X_{r,s}(\Omega)$ we consider the operator

$$ T : (f,g) \mapsto (f+g, f-g). $$

Its norm will be denoted by $\|T\|_{r,s}$.
(b) Show that $\|T\|_{1,1} = 2$ and $\|T\|_{2,2} = \sqrt2$.
(c) Deduce the first inequality in (1).
(d) Show that for all $f,g \in L^1(\Omega)$ we have $\|T(f,g)\|_{1,\infty} \le \|(f,g)\|_{1,1}$.
(e) Deduce the second inequality in (1).
(f) Prove the inequalities in (2).
(g) Prove that $L^p(\Omega)$ is strictly convex, that is, $\|f\|_p = \|g\|_p = 1$ with $f \ne g$
implies $\|\frac12(f+g)\|_p < 1$.
5.27 State and prove an analogue of the Hausdorff–Young theorem for the circle.
5.28 Write out the details of the proof of the Marcinkiewicz interpolation theorem for
the case $p_1 = \infty$.


6 
Spectral Theory 


Spectral theory is the branch of operator theory that seeks to extend the theory of eigenvalues
to an infinite-dimensional setting. Much of its power derives from the observation
that, away from the spectrum of a bounded operator $T$, the operator-valued function
$\lambda \mapsto (\lambda - T)^{-1}$ is holomorphic. This makes it possible to import results from the theory
of functions into operator theory. For instance, the fact that bounded operators on
nonzero Banach spaces have nonempty spectra is deduced from Liouville’s theorem,
and the Cauchy integral formula can be used to introduce a functional calculus for functions
holomorphic in an open set containing the spectrum of $T$.


6.1 Spectrum and Resolvent 


In Linear Algebra, a complex number $\lambda$ is said to be an eigenvalue of an $(n\times n)$ matrix $A$
with complex coefficients if there exists a nonzero vector $x \in \mathbb{C}^n$ such that $Ax = \lambda x$. The
number $\lambda$ is an eigenvalue if and only if $\lambda I - A$ fails to be invertible, or equivalently,
if and only if $\det(\lambda I - A) = 0$. Writing out the determinant we obtain the so-called
characteristic polynomial in the variable $\lambda$, which has $n$ zeroes (counting multiplicities)
by the fundamental theorem of algebra. Our first task will be to investigate to what extent these
results generalise to bounded operators acting on a Banach space.

Throughout the chapter, $T$ denotes a bounded operator acting on a complex Banach space $X$.
We work over the complex scalars; this convention will remain in force throughout the rest
of this work.


Definition 6.1 (Resolvent and spectrum). The resolvent set of an operator $T \in \mathscr{L}(X)$
is the set $\rho(T)$ consisting of all $\lambda \in \mathbb{C}$ for which the operator $\lambda I - T$ is boundedly
invertible, by which we mean that there exists a bounded operator $U$ on $X$ such that

$$ (\lambda I - T)U = U(\lambda I - T) = I. $$

The spectrum of $T$ is the complement of the resolvent set of $T$:

$$ \sigma(T) := \mathbb{C}\setminus\rho(T). $$

From now on we shall write $\lambda - T$ instead of $\lambda I - T$. It is customary to write

$$ R(\lambda,T) := (\lambda - T)^{-1} $$

for the resolvent operator of $T$ at the point $\lambda \in \rho(T)$. By the open mapping theorem
(Theorem 5.8), a complex number $\lambda$ belongs to $\rho(T)$ if and only if $\lambda - T$ is a bijection
on $X$.


Example 6.2. The spectrum of an $(n\times n)$ matrix with complex coefficients, viewed as
a bounded operator acting on $\mathbb{C}^n$, equals its set of eigenvalues.


In the present setting, a complex number $\lambda$ is said to be an eigenvalue of the bounded
operator $T \in \mathscr{L}(X)$ if $Tx = \lambda x$ for some nonzero vector $x \in X$; such a vector is then said
to be an eigenvector. The set $\sigma_{\rm p}(T)$ of all eigenvalues of $T$ is called the point spectrum
of $T$. If $\lambda$ is an eigenvalue of $T$, then $\lambda - T$ is not injective and therefore not invertible.
As a result, eigenvalues belong to the spectrum. In contrast to the finite-dimensional
situation, however, points in the spectrum need not be eigenvalues:


Example 6.3. The right shift $T$ on $\ell^2$, given by

$$ T : (c_1,c_2,\dots) \mapsto (0,c_1,c_2,\dots), $$

has no eigenvalues. Indeed, the identity $Tc = \lambda c$ can be written out as

$$ (0,c_1,c_2,\dots) = (\lambda c_1,\lambda c_2,\dots). $$

If $\lambda \ne 0$, comparison of the entries of these sequences inductively gives $c_n = 0$ for all
$n \ge 1$. If $\lambda = 0$, the identity reads $(0,c_1,c_2,\dots) = (0,0,\dots)$ and again we obtain $c_n = 0$
for all $n \ge 1$. In both cases we find that $Tc = \lambda c$ only admits the zero solution $c = 0$.

The spectrum of $T$ equals the closed unit disc: $\sigma(T) = \overline{\mathbb{D}}$. This can be proved directly
(see Problem 6.4) or by the following argument based on results proved below. By
Proposition 6.18 we have $\sigma(T) = \sigma(T^*)$. The adjoint operator $T^*$ is readily identified
as the left shift $(c_1,c_2,\dots) \mapsto (c_2,c_3,\dots)$. For each $\lambda \in \mathbb{C}$ with $|\lambda| < 1$, the element
$(1,\lambda,\lambda^2,\dots) \in \ell^2$ is an eigenvector for this operator with eigenvalue $\lambda$. It follows that
$\mathbb{D} \subseteq \sigma(T)$. Since by Lemma 6.7 the spectrum of a bounded operator is closed, this
forces $\overline{\mathbb{D}} \subseteq \sigma(T)$. On the other hand, by Lemma 6.6, the fact that $T$ is a contraction
implies that $\sigma(T) \subseteq \overline{\mathbb{D}}$.
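
The eigenvector claim for the left shift can be checked on truncated sequences. The sketch below (with hypothetical helper names) cuts the geometric sequence $(1,\lambda,\lambda^2,\dots)$ off after finitely many terms; the left shift then agrees with multiplication by $\lambda$ in every coordinate except the last, so the residual has norm $|\lambda|^N$, which tends to zero when $|\lambda| < 1$.

```python
def left_shift(c):
    """Left shift on finitely supported sequences: (c1, c2, ...) -> (c2, c3, ...)."""
    return c[1:] + [0j]


def truncated_geometric(lam, length):
    """First `length` terms of the sequence (1, lam, lam^2, ...)."""
    return [lam ** n for n in range(length)]


def residual_norm(lam, length):
    """l^2 norm of (left shift - lam) applied to the truncated geometric sequence."""
    c = truncated_geometric(lam, length)
    shifted = left_shift(c)
    return sum(abs(s - lam * x) ** 2 for s, x in zip(shifted, c)) ** 0.5


if __name__ == "__main__":
    lam = 0.5 + 0.5j
    for length in (5, 10, 20, 40):
        print(length, residual_norm(lam, length), abs(lam) ** length)
```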


Remark 6.4. In contrast to the case of matrices in finite dimensions, the existence
of a left inverse does not imply the existence of a right inverse and vice versa. For
example, the right shift in $\ell^2$ has a left inverse, namely the left shift, but not a right
inverse; similarly the left shift in $\ell^2$ has a right inverse, namely the right shift, but not
a left inverse. This is the reason for insisting on the existence of a two-sided inverse
in Definition 6.1. If both a left inverse $U_l$ and a right inverse $U_r$ exist, then necessarily
$U_l = U_r$ and this operator is a two-sided inverse.


If the bounded operators $T$ and $U$ are boundedly invertible, then so is $TU$ and

$$ (TU)^{-1} = U^{-1}T^{-1}. $$

Invertibility of $TU$ by itself does not imply invertibility of $T$ or $U$; a counterexample is
obtained by taking $T$ and $U$ to be the left and right shift on $\ell^2$. However, we do have the
following result.


Lemma 6.5. The product of two commuting bounded operators is invertible if and only 
if each of the operators is invertible. 


Proof The ‘if’ part is clear. For the ‘only if’ part, suppose that $TU = UT$ is invertible.
The invertibility of $TU$ implies that $T$ is surjective and $U$ is injective, and likewise the
invertibility of $UT$ implies that $U$ is surjective and $T$ is injective. It follows that both $T$
and $U$ are bijective, and by the open mapping theorem their inverses are bounded.


Our first main result, Theorem 6.11, asserts that the spectrum of a bounded operator 
acting on a nonzero Banach space is always a nonempty compact subset of the complex 
plane. This will be deduced from a series of lemmas. 


Lemma 6.6 (Neumann series). If $\|T\| < 1$, then $I - T$ is boundedly invertible and its
inverse is given by the absolutely convergent series

$$ (I - T)^{-1} = \sum_{n=0}^{\infty} T^n. $$

As a consequence, the spectrum of a bounded operator $T$ is contained in the closed disc

$$ \{\lambda \in \mathbb{C} : |\lambda| \le \|T\|\}. $$

Proof The absolute convergence in $\mathscr{L}(X)$ of the series follows from $\sum_{n=0}^{\infty}\|T^n\| \le
\sum_{n=0}^{\infty}\|T\|^n < \infty$. By the completeness of $\mathscr{L}(X)$, the series $\sum_{n=0}^{\infty} T^n$ converges in $\mathscr{L}(X)$.
The identity in the statement of the lemma is a consequence of the identities

$$ (I - T)\sum_{n=0}^{N} T^n = \sum_{n=0}^{N} T^n(I - T) = I - T^{N+1}, $$

valid for all $N \ge 1$. Upon letting $N \to \infty$ they give

$$ (I - T)\sum_{n=0}^{\infty} T^n = \sum_{n=0}^{\infty} T^n(I - T) = I, $$

which means that the bounded operator $\sum_{n=0}^{\infty} T^n$ is a two-sided inverse for $I - T$.
To prove the second assertion, let $T \in \mathscr{L}(X)$ be an arbitrary bounded operator. By
the first assertion, for all $\lambda \in \mathbb{C}$ with $|\lambda| > \|T\|$ the operator $\lambda - T = \lambda(I - T/\lambda)$ is
boundedly invertible.
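
For matrices the Neumann series can be summed numerically. The following sketch uses a $2\times2$ matrix chosen purely for illustration (its maximal absolute row sum is $0.5$, so its norm as an operator on $\mathbb{C}^2$ with the supremum norm is less than one) and checks that $(I-T)\sum_{n<N}T^n$ is close to the identity. All names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_add(a, b):
    """Entrywise sum of two square matrices."""
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


def identity(n):
    """The n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def neumann_partial_sum(t, terms):
    """Return I + T + ... + T^(terms - 1)."""
    n = len(t)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for _ in range(terms):
        total = mat_add(total, power)
        power = mat_mul(power, t)
    return total


T = [[0.2, 0.3], [0.1, 0.4]]  # illustrative example with maximal row sum 0.5
S = neumann_partial_sum(T, 60)
I_MINUS_T = [[0.8, -0.3], [-0.1, 0.6]]
PRODUCT = mat_mul(I_MINUS_T, S)  # approximately the identity matrix
```

The truncation error is $T^{N}$, whose norm is at most $0.5^{N}$ in this example, which is why sixty terms already reproduce the inverse to machine precision.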


As an application we prove that the spectrum is always a closed subset of C. 


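Lemma 6.7. The spectrum $\sigma(T)$ is a closed subset of $\mathbb{C}$. More precisely, if $\lambda \in \rho(T)$,
then $\rho(T)$ contains the open ball with centre $\lambda$ and radius $r = 1/\|R(\lambda,T)\|$. If $|\lambda - \mu| \le
\delta r$ with $0 \le \delta < 1$, then

$$ \|R(\mu,T)\| \le \frac{1}{1-\delta}\|R(\lambda,T)\|. $$

Proof If $S \in \mathscr{L}(X)$ is boundedly invertible and $U \in \mathscr{L}(X)$ has norm $\|U\| \le \delta r$ with
$r := 1/\|S^{-1}\|$, then $\|S^{-1}U\| \le \delta < 1$ and therefore, by Lemma 6.6, $S - U = S(I - S^{-1}U)$
is boundedly invertible and

$$ \|(S - U)^{-1}\| \le \|S^{-1}\|\,\|(I - S^{-1}U)^{-1}\| = \|S^{-1}\|\,\Big\|\sum_{n=0}^{\infty}(S^{-1}U)^n\Big\| \le \frac1r\sum_{n=0}^{\infty}\delta^n = \frac1r\cdot\frac{1}{1-\delta}. $$

Applying this with $S = \lambda - T$ and $U = (\lambda - \mu)I$, so that $S - U = \mu - T$, gives the result.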


Lemma 6.8 (Resolvent identity). For all $\lambda,\mu \in \rho(T)$ we have

$$ R(\lambda,T) - R(\mu,T) = (\mu - \lambda)R(\lambda,T)R(\mu,T). $$

Proof Multiply both sides with the invertible operator $(\mu - T)(\lambda - T)$.
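
The resolvent identity, in its standard form $R(\lambda,T) - R(\mu,T) = (\mu-\lambda)R(\lambda,T)R(\mu,T)$, can be tested numerically for a $2\times2$ matrix. The sketch below is an illustration with hypothetical names; it inverts $\lambda - T$ with the adjugate formula and reports the largest entry of the difference between the two sides.

```python
def resolvent(t, lam):
    """R(lam, T) = (lam*I - T)^(-1) for a 2x2 matrix T, via the adjugate formula."""
    a, b = lam - t[0][0], -t[0][1]
    c, d = -t[1][0], lam - t[1][1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("lam lies in the spectrum of T")
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_mul2(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def identity_gap(t, lam, mu):
    """Largest entry of R(lam) - R(mu) - (mu - lam) * R(lam) * R(mu)."""
    r_lam, r_mu = resolvent(t, lam), resolvent(t, mu)
    prod = mat_mul2(r_lam, r_mu)
    return max(
        abs(r_lam[i][j] - r_mu[i][j] - (mu - lam) * prod[i][j])
        for i in range(2)
        for j in range(2)
    )


if __name__ == "__main__":
    T = [[2.0, 1.0], [0.0, 3.0]]  # eigenvalues 2 and 3
    print(identity_gap(T, 1j, 5.0))
```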


Definition 6.9 (Holomorphy). Let $\Omega$ be an open subset of $\mathbb{C}$. A function $f : \Omega \to X$ is
holomorphic if for all $z_0 \in \Omega$ the limit

$$ \lim_{z\to z_0}\frac{f(z) - f(z_0)}{z - z_0} $$

exists in $X$.


Some properties of Banach space-valued functions have already been explored in 
Problems 4.2 and 5.5. 


Lemma 6.10. The function $\lambda \mapsto R(\lambda,T)$ is holomorphic on $\rho(T)$ and satisfies

$$ \lim_{|\lambda|\to\infty}\|R(\lambda,T)\| = 0. $$

Proof Continuity of the mapping $\lambda \mapsto R(\lambda,T)$ follows from the resolvent identity and
the bound in Lemma 6.7. To prove the holomorphy of $\lambda \mapsto R(\lambda,T)$ on $\rho(T)$ we use the
resolvent identity and the continuity of $\lambda \mapsto R(\lambda,T)$ to obtain

$$ \lim_{\mu\to\lambda}\frac{R(\lambda,T) - R(\mu,T)}{\lambda - \mu} = -\lim_{\mu\to\lambda} R(\lambda,T)R(\mu,T) = -(R(\lambda,T))^2. $$

For $|\lambda| \ge 2\|T\|$ the Neumann series gives

$$ \|R(\lambda,T)\| = |\lambda|^{-1}\|(I - \lambda^{-1}T)^{-1}\| \le |\lambda|^{-1}\sum_{n=0}^{\infty}|\lambda|^{-n}\|T\|^n \le \frac{2}{|\lambda|}. $$

This proves the second assertion.

We are ready for the first main result of this chapter:


Theorem 6.11 (Non-emptiness of the spectrum). If $T$ is a bounded operator on a
nonzero Banach space, then $\sigma(T)$ is a nonempty compact subset of the closed disc

$$ \{\lambda \in \mathbb{C} : |\lambda| \le \|T\|\}. $$

Proof Containment in $\{\lambda \in \mathbb{C} : |\lambda| \le \|T\|\}$ and closedness of the spectrum have already
been proved in Lemmas 6.6 and 6.7, respectively. Since bounded closed subsets
of $\mathbb{C}$ are compact, this gives the compactness of $\sigma(T)$.

Suppose, for a contradiction, that $\sigma(T) = \varnothing$. Then the function $\lambda \mapsto R(\lambda,T)$ is holomorphic
on $\mathbb{C}$. By Lemma 6.10 it is also bounded. Now we are in a position to apply
Liouville’s theorem: for all $x \in X$ and $x^* \in X^*$ we find that $\lambda \mapsto \langle R(\lambda,T)x,x^*\rangle$ is constant.
Its limit for $|\lambda| \to \infty$ is zero, and therefore $\langle R(\lambda,T)x,x^*\rangle = 0$ for all $\lambda \in \rho(T)$ and
all $x \in X$ and $x^* \in X^*$. By the Hahn–Banach theorem, $R(\lambda,T)x = 0$ for all $\lambda \in \rho(T)$ and
all $x \in X$. It follows that $R(\lambda,T) = 0$ for all $\lambda \in \rho(T)$. This implies $X = \{0\}$.


Instead of using duality to reduce matters to scalar-valued functions one may note that 
the proof of Liouville’s theorem generalises mutatis mutandis to holomorphic functions 
with value in a Banach space. 

We have seen in Lemma 6.10 that the resolvent $\lambda \mapsto R(\lambda,T)$ is holomorphic on $\rho(T)$.
The next result shows that the topological boundary $\partial\rho(T) := \overline{\rho(T)}\setminus\rho(T)$ is a natural
barrier for holomorphy for this function.


Proposition 6.12. If $\lambda_n \to \lambda$ in $\mathbb{C}$, with each $\lambda_n \in \rho(T)$ and $\lambda \in \partial\rho(T)$, then

$$ \lim_{n\to\infty}\|R(\lambda_n,T)\| = \infty. $$

Proof By Lemma 6.7, if $\mu \in \rho(T)$, then the open ball $B(\mu;\|R(\mu,T)\|^{-1})$ is contained
in $\rho(T)$. This implies the more precise assertion that for all $\mu \in \rho(T)$ we have
$d(\mu,\sigma(T)) \ge \|R(\mu,T)\|^{-1}$, that is, $\|R(\mu,T)\| \ge 1/d(\mu,\sigma(T))$.


An immediate application is the following analytic continuation result. 


Corollary 6.13. If $D \subseteq \mathbb{C}$ is a connected open set intersecting the resolvent set of a
bounded operator $T$, and if $\lambda \mapsto R(\lambda,T)$ extends holomorphically to $D$, then $D \subseteq \rho(T)$.


A more substantial application of Proposition 6.12 is the following result about the
spectra of isometries. Recall that an isometry is an operator $T \in \mathscr{L}(X,Y)$ such that
$\|Tx\| = \|x\|$ for all $x \in X$.
Corollary 6.14. The spectrum of an isometry is either contained in the unit circle $\mathbb{T}$ or
else equals the closed unit disc $\overline{\mathbb{D}}$.


Proof First note that an isometry $T$ has norm $\|T\| = 1$, so that $\sigma(T) \subseteq \overline{\mathbb{D}}$ by the second
assertion of Lemma 6.6.
If $\mu \in \rho(T)$ with $0 \le |\mu| < 1$, then, using that $T$ is an isometry, for all $x \in X$ we find

$$ \|x\| = \|(\mu - T)R(\mu,T)x\| \ge \|TR(\mu,T)x\| - |\mu|\,\|R(\mu,T)x\| = (1 - |\mu|)\|R(\mu,T)x\| $$

and therefore $\|R(\mu,T)\| \le 1/(1 - |\mu|)$. In view of Proposition 6.12, this implies that
for all $0 < r < 1$ the disc $\{\mu \in \mathbb{C} : |\mu| < r\}$ does not contain boundary points of $\sigma(T)$.
This being true for all $0 < r < 1$ it follows that either the open unit disc $\mathbb{D}$ is contained
in $\sigma(T)$ or else $\mathbb{D}$ contains no points of $\sigma(T)$. In the former case we have $\sigma(T) = \overline{\mathbb{D}}$, as
$\sigma(T)$ is closed and contained in $\overline{\mathbb{D}}$; in the latter case we have $\sigma(T) \subseteq \mathbb{T}$.


This result is the best possible in the following sense: both the closed unit disc and
any nonempty closed subset of the unit circle can be realised as the spectrum of suitable
isometries. For instance, the right shift on $\ell^2$ provides an example of the former,
and if $K \subseteq \mathbb{T}$ is a nonempty closed set, then the operator $(Tf)(z) = zf(z)$ on $C(K)$
is easily verified to have spectrum equal to $K$.

We have the following continuity result for spectra: 


Proposition 6.15 (Lower semicontinuity of the spectrum). Let $\Omega$ be an open set in the
complex plane containing $\sigma(T)$. Then there exists a $\delta > 0$ such that if the bounded
operator $T'$ satisfies $\|T - T'\| < \delta$, then $\sigma(T') \subseteq \Omega$.


Proof In the proof of Lemma 6.10 we have seen that $\lim_{|\lambda|\to\infty}\|R(\lambda,T)\| = 0$. In particular,
$\sup_{|\lambda|\ge2\|T\|}\|R(\lambda,T)\| < \infty$. By the continuity of $\lambda \mapsto R(\lambda,T)$ we also have
$\sup_{\{|\lambda|\le2\|T\|\}\cap\{\lambda\notin\Omega\}}\|R(\lambda,T)\| < \infty$. Combining these, we find that

$$ \sup_{\lambda\notin\Omega}\|R(\lambda,T)\| < \infty. $$

Denote this supremum by $M$. If $\|T - T'\| < 1/M$ and $\lambda \notin \Omega$, then from

$$ \lambda - T' = (\lambda - T)[I + R(\lambda,T)(T - T')] $$

we infer that $\lambda - T'$ is invertible, noting that $I + R(\lambda,T)(T - T')$ is invertible since
$\|R(\lambda,T)(T - T')\| < M\cdot1/M = 1$.


It has already been noted that eigenvalues belong to the spectrum, but that a bounded 
operator need not have any eigenvalues (see Example 6.3). We now prove a useful result 
that makes up for this to some extent. 


Definition 6.16 (Approximate eigenspectrum). The number $\lambda \in \mathbb{C}$ is called an approximate
eigenvalue of the operator $T$ if there exists a sequence $(x_n)_{n\ge 1}$ in $X$ with the
following two properties:

(1) $\|x_n\| = 1$ for all $n \ge 1$;
(2) $\|Tx_n - \lambda x_n\| \to 0$ as $n \to \infty$.

In this context the sequence $(x_n)_{n\ge 1}$ is called an approximate eigensequence for $\lambda$. The
set of all approximate eigenvalues is called the approximate point spectrum.
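To make the definition concrete, here is a worked example that is not taken from the text: the multiplication operator on $L^2(0,1)$ is chosen purely for illustration. It shows an operator without eigenvalues for which every point of $[0,1)$ is nevertheless an approximate eigenvalue.

```latex
Let $(Tf)(t) := t f(t)$ on $X = L^2(0,1)$ and fix $\lambda \in [0,1)$.
For $n > 1/(1-\lambda)$ set $x_n := \sqrt{n}\,\mathbf{1}_{(\lambda,\lambda+1/n)}$, so that $\|x_n\|_2 = 1$. Then
\[
  \|Tx_n - \lambda x_n\|_2^2
  = n \int_{\lambda}^{\lambda + 1/n} (t - \lambda)^2 \, dt
  = n \cdot \frac{1}{3n^3}
  = \frac{1}{3n^2} \to 0 \quad \text{as } n \to \infty .
\]
After reindexing, $(x_n)$ is an approximate eigensequence for $\lambda$.
On the other hand, $T$ has no eigenvalues: if $t f(t) = \lambda f(t)$ for almost all $t$,
then $f = 0$ almost everywhere.
```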


Every eigenvalue is an approximate eigenvalue. Approximate eigenvalues belong to
the spectrum, for if $\lambda$ were an approximate eigenvalue belonging to the resolvent set of
$T$, we would arrive at the contradiction

\[ 1 = \|x_n\| = \|R(\lambda,T)(\lambda x_n - Tx_n)\| \le \|R(\lambda,T)\|\,\|Tx_n - \lambda x_n\| \to 0 \quad \text{as } n \to \infty. \]

Proposition 6.17. The boundary of $\sigma(T)$ consists of approximate eigenvalues.


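Proof If $\lambda \in \partial\sigma(T)$, then there exists a sequence $(\lambda_n)_{n\ge 1}$ in $\rho(T)$ converging to $\lambda$.
Using Proposition 6.12 and the uniform boundedness principle, we find a vector $x \in X$
such that, after passing to a subsequence, $\|R(\lambda_n,T)x\| \to \infty$. The vectors

\[ x_n := \frac{R(\lambda_n,T)x}{\|R(\lambda_n,T)x\|} \]

then define an approximate eigensequence: this follows from

\[ \|Tx_n - \lambda x_n\| = \frac{\|[(T - \lambda_n) + (\lambda_n - \lambda)]R(\lambda_n,T)x\|}{\|R(\lambda_n,T)x\|} \le \frac{\|x\|}{\|R(\lambda_n,T)x\|} + |\lambda_n - \lambda| \to 0. \]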


We have the following duality result:

Proposition 6.18. The spectrum of the adjoint of a bounded operator $T$ equals
\[ \sigma(T^*) = \sigma(T). \]

Proof If $\lambda \in \rho(T)$, then
\[ (\lambda - T^*)[R(\lambda,T)]^* = [R(\lambda,T)(\lambda - T)]^* = I_X^* = I_{X^*}, \]
and similarly $[R(\lambda,T)]^*(\lambda - T^*) = I_{X^*}$, from which it follows that $\lambda \in \rho(T^*)$ and
$R(\lambda,T^*) = [R(\lambda,T)]^*$. This proves the inclusion $\sigma(T^*) \subseteq \sigma(T)$.

To complete the proof we show that $\rho(T^*) \subseteq \rho(T)$. Applying what we just proved to
$T^*$ we obtain $\rho(T^*) \subseteq \rho(T^{**})$. Fix $\lambda \in \rho(T^*)$. Identifying $X$ with its natural isometric
image in $X^{**}$ (see Proposition 4.21), we wish to prove that the restriction of $R(\lambda,T^{**})$
to $X$ maps $X$ into itself. This restriction will be a two-sided inverse for $\lambda - T$, proving
that $\lambda \in \rho(T)$.

By Proposition 5.15 the range $\mathsf{R}(\lambda - T)$ is dense in $X$. Moreover, for all $x \in X$ we
have

\[ x = R(\lambda,T^{**})(\lambda - T^{**})x = R(\lambda,T^{**})(\lambda - T)x. \]

This shows that $R(\lambda,T^{**})$ maps the dense subspace $\mathsf{R}(\lambda - T)$ of $X$ into $X$, and therefore
it maps all of $X$ into $X$, by the boundedness of $R(\lambda,T^{**})$ and the closedness of $X$ in $X^{**}$.
The restriction of $R(\lambda,T^{**})$ to $X$ is therefore a bounded operator on $X$, which is a left
inverse to $\lambda - T$. It is also a right inverse, since in $X^{**}$ we have the identities

\[ (\lambda - T)R(\lambda,T^{**})x = (\lambda - T^{**})R(\lambda,T^{**})x = x. \]

This proves that $\lambda - T$ is invertible, with inverse $R(\lambda,T^{**})|_X$. Hence $\rho(T^*) \subseteq \rho(T)$, or
equivalently $\sigma(T) \subseteq \sigma(T^*)$.


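We finish this section with an observation about relative spectra. Let $\mathscr{A} \subseteq \mathscr{L}(X)$ be a
unital closed subalgebra, that is, $\mathscr{A}$ is a closed subspace of $\mathscr{L}(X)$ closed under taking
compositions and containing the identity operator $I$. For an operator $T \in \mathscr{A}$ we define
$\rho_{\mathscr{A}}(T)$ as the set of all $\lambda \in \mathbb{C}$ for which the operator $\lambda - T$ is boundedly invertible in
$\mathscr{A}$, that is, there exists an operator $U \in \mathscr{A}$ such that $(\lambda - T)U = U(\lambda - T) = I$. We
further set $\sigma_{\mathscr{A}}(T) := \mathbb{C} \setminus \rho_{\mathscr{A}}(T)$. By redoing the proofs of Lemmas 6.6 and 6.7, $\rho_{\mathscr{A}}(T)$
is an open set and $\sigma_{\mathscr{A}}(T)$ is a closed set contained in the closed disc of radius $\|T\|$, and
therefore $\sigma_{\mathscr{A}}(T)$ is compact.

It is evident that $\rho_{\mathscr{A}}(T) \subseteq \rho(T)$ and therefore

\[ \sigma(T) \subseteq \sigma_{\mathscr{A}}(T). \]

The next result provides a partial converse.

Proposition 6.19. Let $\mathscr{A} \subseteq \mathscr{L}(X)$ be a unital closed subalgebra and let $T \in \mathscr{A}$. Then
\[ \partial\sigma_{\mathscr{A}}(T) \subseteq \sigma(T). \]

Proof Let $\lambda \in \partial\sigma_{\mathscr{A}}(T)$ and let $\lambda_n \to \lambda$ in $\mathbb{C}$ with each $\lambda_n$ in $\rho_{\mathscr{A}}(T)$. By redoing
the proof of Proposition 6.12 we obtain that $\|R(\lambda_n,T)\| \to \infty$ as $n \to \infty$. Since $\rho(T)$ is
open and the resolvent $\lambda \mapsto R(\lambda,T)$ is continuous with respect to the operator norm, this
implies that $\lambda \in \sigma(T)$: otherwise the norms $\|R(\lambda_n,T)\|$ would converge to $\|R(\lambda,T)\|$.

The example of the left shift $T$ on $X = \ell^2(\mathbb{Z})$ and the unital closed subalgebra
generated by the identity and the right shift, shows that this result is the best possible:
here one has $\partial\sigma_{\mathscr{A}}(T) = \sigma(T) = \mathbb{T}$ and $\sigma_{\mathscr{A}}(T) = \overline{\mathbb{D}}$.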


6.2 The Holomorphic Functional Calculus


If $f : \mathbb{C} \to \mathbb{C}$ is an entire function and $T$ is a bounded operator on $X$, we may define a
bounded operator $f(T)$ on $X$ as follows. Writing $f$ as a convergent power series about
$z = 0$,

\[ f(z) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} z^n, \]

we define

\[ f(T) := \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} T^n. \]

This series converges absolutely in $\mathscr{L}(X)$ since the same is true for the power series of
$f(z)$ for every $z \in \mathbb{C}$. The mapping $f \mapsto f(T)$ is called the entire functional calculus of
$T$ and has the following properties, each of which is a consequence of the corresponding
properties for scalar-valued entire functions:


(i) if $f(z) = z^n$ with $n \in \mathbb{N}$, then $f(T) = T^n$;
(ii) $f(T)g(T) = (fg)(T)$;
(iii) $g(f(T)) = (g \circ f)(T)$.
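The entire functional calculus can be tried out numerically on matrices. The following is a minimal sketch, assuming $X = \mathbb{R}^2$ and a truncated power series; the helper names `mat_mul` and `entire_calculus` and the choice of the rotation generator are illustrative and not part of the text. It checks property (ii) for $f(z) = e^z$ and $g(z) = e^{-z}$, for which $fg = 1$.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def entire_calculus(coeff, t, terms=40):
    """Partial sum  sum_{k < terms} coeff(k) * T^k  of the series defining f(T)."""
    n = len(t)
    result = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # T^0
    for k in range(terms):
        c = coeff(k)
        result = [[result[i][j] + c * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = mat_mul(power, t)  # now holds T^(k+1)
    return result

def exp_coeff(k):
    """Taylor coefficient of exp(z) at z = 0."""
    return 1.0 / math.factorial(k)

def exp_minus_coeff(k):
    """Taylor coefficient of exp(-z) at z = 0."""
    return (-1.0) ** k / math.factorial(k)

T = [[0.0, 1.0], [-1.0, 0.0]]  # T^2 = -I, so exp(T) is a rotation by one radian

exp_T = entire_calculus(exp_coeff, T)
exp_minus_T = entire_calculus(exp_minus_coeff, T)
product = mat_mul(exp_T, exp_minus_T)  # (fg)(T) with fg = 1, so close to I
```

Since $T^2 = -I$ for this matrix, the series can be summed by hand, giving $\exp(T) = \cos(1)\,I + \sin(1)\,T$, which is what the sketch can be compared against.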


This calculus may be used to define operators such as $\exp(T)$, $\sin(T)$, $\cos(T)$, and so
forth. There is a beautiful way to extend the entire functional calculus to a larger class
of holomorphic functions, namely by replacing power series expansions by the Cauchy
integral formula

\[ f(z_0) = \frac{1}{2\pi i} \int_{\Gamma} \frac{f(\lambda)}{\lambda - z_0} \, d\lambda. \tag{6.1} \]

Here, $f$ is assumed to be a holomorphic function on an open set $\Omega$ in the complex plane
containing $z_0$, and $\Gamma$ is a suitable contour winding about $z_0$ in $\Omega$. Formally substituting $T$
for $z_0$ and interpreting $1/(\lambda - T)$ as $R(\lambda,T)$, one is led to conjecture that a holomorphic
functional calculus may be defined by the formula

\[ f(T) = \frac{1}{2\pi i} \int_{\Gamma} f(\lambda) R(\lambda,T) \, d\lambda. \tag{6.2} \]

Since $\lambda \mapsto R(\lambda,T)$ is continuous on $\rho(T)$ with respect to the operator norm of $\mathscr{L}(X)$,
after parametrising $\Gamma$ the integral in (6.2) is well defined as a Riemann integral with
values in $\mathscr{L}(X)$ (see Section 1.1).

In order to flesh out a set of conditions on $\Omega$, $\Gamma$, and $T$ to make this idea work we
first take a closer look at the precise assumptions in the Cauchy integral formula (6.1).
These are that $\Omega$ is an open set in the complex plane containing $z_0$ and $\Gamma$ is a piecewise
continuously differentiable closed contour in $\Omega \setminus \{z_0\}$ with the following two properties:

(i) the winding number of $\Gamma$ about the point $z_0$ equals 1;
(ii) the winding number of $\Gamma$ about every point in $\complement\Omega$ equals 0.

Here, the winding number of $\Gamma$ about a point $z$ is the (integer) number

\[ w(\Gamma; z) := \frac{1}{2\pi i} \int_{\Gamma} \frac{1}{\lambda - z} \, d\lambda. \]

More generally we can admit finite unions of such contours, as long as their union
satisfies (i) and (ii). If $\Gamma = \Gamma_1 \cup \dots \cup \Gamma_k$ is such a union, we define

\[ w(\Gamma; z) := \sum_{j=1}^{k} \frac{1}{2\pi i} \int_{\Gamma_j} \frac{1}{\lambda - z} \, d\lambda. \]

Condition (ii) is satisfied if $\Omega = \Omega_1 \cup \dots \cup \Omega_k$ is a finite union of disjoint convex sets
and $\Gamma_j$ is a piecewise continuously differentiable closed contour in $\Omega_j \setminus \{z_0\}$.

Turning to the discussion of (6.2), we need to fix a similar set of assumptions
regarding $\Omega$ and $\Gamma$ while letting the operator $T$ take the role of $z_0$. We require that $\Omega$ is
an open set in the complex plane containing $\sigma(T)$ and $\Gamma$ is a piecewise continuously
differentiable closed contour in $\Omega \setminus \sigma(T)$ with the following two properties:

(i) the winding number of $\Gamma$ about every point $z_0 \in \sigma(T)$ equals 1;
(ii) the winding number of $\Gamma$ about every point in $\complement\Omega$ equals 0.

If the conditions (i) and (ii) are met we say that $\Gamma$ is an admissible contour in $\Omega$ for
$\sigma(T)$. As before we admit the possibility that $\Gamma = \Gamma_1 \cup \dots \cup \Gamma_k$ is a finite union of such
contours; for such contours we interpret (6.2) as

\[ f(T) = \sum_{j=1}^{k} \frac{1}{2\pi i} \int_{\Gamma_j} f(\lambda) R(\lambda,T) \, d\lambda. \]

For example, if $\sigma(T) = K_1 \cup K_2$ is the union of two disjoint compact sets with $K_j \subseteq \Omega_j$,
where $\Omega := \Omega_1 \cup \Omega_2$ is a disjoint union of open convex sets, we may select contours
$\Gamma_j$ in $\Omega_j$ with winding number 1 about every point in $K_j$ and consider $\Gamma = \Gamma_1 \cup \Gamma_2$ as an
admissible contour in $\Omega$ for $\sigma(T)$. This example is relevant for defining spectral projections,
where one uses the holomorphic functions $f = \mathbf{1}_{\Omega_1}$ and $f = \mathbf{1}_{\Omega_2}$ (see Theorem 6.25).

For an open set $\Omega$ in the complex plane we denote by $H(\Omega)$ the vector space of all
holomorphic functions on $\Omega$.


Theorem 6.20 (Holomorphic functional calculus). Let $\Omega \subseteq \mathbb{C}$ be an open set containing
$\sigma(T)$. For functions $f \in H(\Omega)$ we define

\[ f(T) := \frac{1}{2\pi i} \int_{\Gamma} f(\lambda) R(\lambda,T) \, d\lambda, \]

where $\Gamma$ is an admissible contour in $\Omega$ for $\sigma(T)$. The resulting operators $f(T)$ are well
defined and have the following properties:

(i) the operator $f(T)$ is independent of the admissible contour $\Gamma$;
(ii) for entire functions the holomorphic and entire functional calculi agree;
(iii) for $f_\mu(\lambda) = 1/(\mu - \lambda)$ with $\mu \in \rho(T)$ we have $f_\mu(T) = R(\mu,T)$;
(iv) for all $f, g \in H(\Omega)$ we have $f(T)g(T) = (fg)(T)$;
(v) for all $f \in H(\Omega)$ we have $f(T^*) = (f(T))^*$.

Figure 6.1 The contour $\Gamma = \Gamma_1 \cup \Gamma_2$ around $\sigma(T) = K_1 \cup K_2$ in $\Omega = \Omega_1 \cup \Omega_2$


Proof (i): This follows by applying the Cauchy theorem to the scalar-valued integrals

\[ \frac{1}{2\pi i} \int_{\Gamma} f(\lambda) \langle R(\lambda,T)x, x^* \rangle \, d\lambda \]

and then using the Hahn–Banach theorem. Alternatively one may extend mutatis
mutandis the proof of the Cauchy theorem to functions with values in a Banach space.

(ii): For $f_n(z) = z^n$ with $n \in \mathbb{N}$ we have, with $\Gamma$ a circle of radius $r > \|T\|$ oriented
counterclockwise,

\[ \frac{1}{2\pi i} \int_{\Gamma} \lambda^n R(\lambda,T) \, d\lambda
   = \frac{1}{2\pi i} \int_{\Gamma} \sum_{k=0}^{\infty} \lambda^{n-1-k} T^k \, d\lambda
   = \sum_{k=0}^{\infty} \Bigl( \frac{1}{2\pi i} \int_{\Gamma} \lambda^{n-1-k} \, d\lambda \Bigr) T^k
   = T^n. \]

Here we first used the Neumann series for $R(\lambda,T) = \lambda^{-1}(I - \lambda^{-1}T)^{-1}$ (noting that
$|\lambda| > \|T\|$ for $\lambda \in \Gamma$), then we interchanged integration and summation (which is
justified by the absolute convergence of the series, uniformly in $\lambda \in \Gamma$), and finally we used
that

\[ \frac{1}{2\pi i} \int_{\Gamma} \lambda^{j} \, d\lambda =
   \begin{cases} 1, & j = -1; \\ 0, & j \in \mathbb{Z} \setminus \{-1\}. \end{cases} \]

By linearity, this proves that the holomorphic calculus agrees with the entire functional
calculus for polynomials. The general case follows by approximating an entire function
by its power series and noting that this approximation is uniform on bounded sets.


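(iv): Let $\Gamma$ and $\Gamma'$ be contours surrounding $\sigma(T)$ in $\Omega$, with $\Gamma'$ to the interior of $\Gamma$
(more precisely, the outer contour $\Gamma$ should have winding number one with respect to
every point on the inner contour $\Gamma'$). By the resolvent identity and Fubini's theorem,

\[ \begin{aligned}
f(T)g(T) &= \frac{1}{(2\pi i)^2} \int_{\Gamma} \int_{\Gamma'} f(\lambda) g(\mu) R(\lambda,T) R(\mu,T) \, d\mu \, d\lambda \\
 &= \frac{1}{(2\pi i)^2} \int_{\Gamma} \int_{\Gamma'} f(\lambda) g(\mu) (\lambda - \mu)^{-1} [R(\mu,T) - R(\lambda,T)] \, d\mu \, d\lambda \\
 &= \frac{1}{(2\pi i)^2} \int_{\Gamma} \int_{\Gamma'} f(\lambda) g(\mu) (\lambda - \mu)^{-1} R(\mu,T) \, d\mu \, d\lambda \\
 &= \frac{1}{(2\pi i)^2} \int_{\Gamma'} \int_{\Gamma} f(\lambda) g(\mu) (\lambda - \mu)^{-1} R(\mu,T) \, d\lambda \, d\mu \\
 &= \frac{1}{2\pi i} \int_{\Gamma'} f(\mu) g(\mu) R(\mu,T) \, d\mu = (fg)(T).
\end{aligned} \]

Here we used that, for $\lambda \in \Gamma$ and $\mu \in \Gamma'$ respectively,

\[ \frac{1}{2\pi i} \int_{\Gamma'} g(\mu) (\lambda - \mu)^{-1} \, d\mu = 0
   \quad \text{and} \quad
   \frac{1}{2\pi i} \int_{\Gamma} f(\lambda) (\lambda - \mu)^{-1} \, d\lambda = f(\mu). \]

(iii): The identity $(\mu - \lambda) \cdot f_\mu(\lambda) = f_\mu(\lambda) \cdot (\mu - \lambda) = 1$ gives, via (ii) and (iv), that
$(\mu - T) f_\mu(T) = f_\mu(T)(\mu - T) = I$. It follows that $f_\mu(T) = R(\mu,T)$.

(v): This follows from Proposition 6.18 and the continuity of the mapping $S \mapsto S^*$,
which allows one to ‘take adjoints under the integral sign’.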
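Formula (6.2) can be tested numerically in finite dimensions. The sketch below assumes a $2 \times 2$ matrix $T$ and approximates the contour integral over a circle by the trapezoidal rule; the function names, the sample matrix, and the number of nodes are illustrative choices, not part of the text. For an upper triangular $T$ with diagonal entries $a \ne d$ and off-diagonal entry $b$, the off-diagonal entry of $f(T)$ is $b\,(f(d) - f(a))/(d - a)$, which gives values to compare against.

```python
import cmath
import math

def resolvent(lam, t):
    """R(lam, T) = (lam - T)^(-1) for a 2x2 matrix T, via the adjugate formula."""
    a, b = lam - t[0][0], -t[0][1]
    c, d = -t[1][0], lam - t[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def holomorphic_calculus(f, t, centre, radius, nodes=128):
    """Trapezoidal approximation of (1/(2 pi i)) * int f(lam) R(lam, T) dlam
    over the counterclockwise circle |lam - centre| = radius."""
    total = [[0j, 0j], [0j, 0j]]
    for k in range(nodes):
        w = radius * cmath.exp(2j * math.pi * k / nodes)  # w = lam - centre
        lam = centre + w
        r = resolvent(lam, t)
        # dlam = i * w * dtheta, hence (1/(2 pi i)) dlam has weight w / nodes
        weight = f(lam) * w / nodes
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * r[i][j]
    return total

T = [[1.0, 2.0], [0.0, 3.0]]  # spectrum {1, 3}, enclosed by the circle |lam| = 5

square = holomorphic_calculus(lambda z: z * z, T, centre=0.0, radius=5.0)
exp_T = holomorphic_calculus(cmath.exp, T, centre=0.0, radius=5.0)
```

For $f(z) = z^2$ the result should agree with the matrix product $T^2$, in line with part (ii) of the theorem, and for $f = \exp$ the off-diagonal entry should be close to $e^3 - e$.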


Theorem 6.21 (Spectral mapping theorem). Let $\Omega \subseteq \mathbb{C}$ be an open set containing $\sigma(T)$.
For all $f \in H(\Omega)$ we have

\[ \sigma(f(T)) = f(\sigma(T)). \]

Proof We begin with the proof of the inclusion $\sigma(f(T)) \subseteq f(\sigma(T))$. Fix $\lambda \notin f(\sigma(T))$;
our aim is to show that $\lambda \notin \sigma(f(T))$. Let $U$ be an open set containing the (compact)
set $f(\sigma(T))$ but not $\lambda$. Then $\Omega' := \Omega \cap f^{-1}(U)$ is an open subset of $\Omega$ containing $\sigma(T)$
and $\lambda \notin f(\Omega')$.

The function $f$ is holomorphic on $\Omega'$, and so is $g_\lambda(z) = (\lambda - f(z))^{-1}$, by the choice
of $\Omega'$. By the multiplicativity of the holomorphic functional calculus applied to $H(\Omega')$,

\[ (\lambda - f(T)) g_\lambda(T) = g_\lambda(T)(\lambda - f(T)) = I, \]

from which we infer that $\lambda \notin \sigma(f(T))$.

Turning to the converse inclusion $f(\sigma(T)) \subseteq \sigma(f(T))$, let $\lambda \in \Omega$. By the theory of
functions in one complex variable we have $f(\lambda) - f(z) = (\lambda - z) h_\lambda(z)$ for some $h_\lambda \in
H(\Omega)$, so $f(\lambda) - f(T) = (\lambda - T) h_\lambda(T)$. If $\lambda \in \sigma(T)$, then $\lambda - T$ is noninvertible. Since
$\lambda - T$ and $h_\lambda(T)$ commute, Lemma 6.5 implies that $f(\lambda) - f(T)$ is noninvertible and
therefore $f(\lambda) \in \sigma(f(T))$.


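Theorem 6.22 (Composition). Let $\Omega \subseteq \mathbb{C}$ be an open set containing $\sigma(T)$, let $f \in
H(\Omega)$, and let $\Omega' \subseteq \mathbb{C}$ be an open set containing $\sigma(f(T))$. Then for all $g \in H(\Omega')$ we
have

\[ g(f(T)) = (g \circ f)(T). \]

Proof Let $\Gamma'$ be an admissible contour in $\Omega'$ for $\sigma(f(T))$, and let $\tilde\Omega$ be an open subset
of $\Omega$ containing $\sigma(T)$, chosen in such a way that $\Gamma'$ has winding number 1 about every
point of $f(\tilde\Omega)$. Let $\Gamma$ be an admissible contour in $\tilde\Omega$ for $\sigma(T)$.
If $\mu$ is a point on $\Gamma'$, then $h_\mu(\lambda) := (\mu - f(\lambda))^{-1}$ defines a function in $H(\tilde\Omega)$ and

\[ h_\mu(T)(\mu - f(T)) = (\mu - f(T)) h_\mu(T) = I. \]

It follows that $\mu \in \rho(f(T))$ and

\[ R(\mu, f(T)) = h_\mu(T) = \frac{1}{2\pi i} \int_{\Gamma} (\mu - f(\lambda))^{-1} R(\lambda,T) \, d\lambda. \]

Hence, using Fubini's theorem to justify the change of order of integration,

\[ \begin{aligned}
g(f(T)) &= \frac{1}{2\pi i} \int_{\Gamma'} g(\mu) R(\mu, f(T)) \, d\mu \\
 &= \Bigl( \frac{1}{2\pi i} \Bigr)^2 \int_{\Gamma'} \int_{\Gamma} g(\mu) (\mu - f(\lambda))^{-1} R(\lambda,T) \, d\lambda \, d\mu \\
 &= \Bigl( \frac{1}{2\pi i} \Bigr)^2 \int_{\Gamma} \int_{\Gamma'} g(\mu) (\mu - f(\lambda))^{-1} R(\lambda,T) \, d\mu \, d\lambda \\
 &= \frac{1}{2\pi i} \int_{\Gamma} g(f(\lambda)) R(\lambda,T) \, d\lambda = (g \circ f)(T).
\end{aligned} \]

In the penultimate identity we used that $\frac{1}{2\pi i} \int_{\Gamma'} g(\mu)(\mu - f(\lambda))^{-1} \, d\mu = g(f(\lambda))$ by the
Cauchy integral formula.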


We shall present two applications of the holomorphic calculus. The first is an exact
formula for the spectral radius

\[ r(T) := \sup\{ |\lambda| : \lambda \in \sigma(T) \} \]

of a bounded operator $T$ (with the convention $\sup \emptyset = 0$ to deal with the trivial case
$X = \{0\}$). Since the spectrum $\sigma(T)$ is contained in the closed disc with radius $\|T\|$ we
have $r(T) \le \|T\|$. More generally, by the spectral mapping theorem, $r(T)^n = r(T^n) \le
\|T^n\|$, so $r(T) \le \|T^n\|^{1/n}$. We actually have equality:


Theorem 6.23 (Gelfand). For every bounded operator $T$ we have

\[ r(T) = \inf_{n \ge 1} \|T^n\|^{1/n} = \lim_{n \to \infty} \|T^n\|^{1/n}. \]

Proof Existence of the limit is part of the assertion. Set $\tilde r(T) := \inf_{n \ge 1} \|T^n\|^{1/n}$. We
claim that $\limsup_{n\to\infty} \|T^n\|^{1/n} \le \tilde r(T)$. Once this is proved, the existence of the limit
$\lim_{n\to\infty} \|T^n\|^{1/n}$ as well as its equality to $\inf_{n\ge 1} \|T^n\|^{1/n}$ are immediate consequences.
Fix $\varepsilon > 0$ and choose $k \ge 1$ such that $\|T^k\|^{1/k} < \tilde r(T) + \varepsilon$. Pick $n \ge 1$ and write
$n = a_n k + b_n$ with $a_n \ge 0$ and $0 \le b_n \le k - 1$ integers. Then $\|T^n\| \le \|T^k\|^{a_n} \|T\|^{b_n}$, so

\[ \|T^n\|^{1/n} \le \|T^k\|^{a_n/n} \|T\|^{b_n/n} \le (\tilde r(T) + \varepsilon)^{k a_n/n} \|T\|^{b_n/n}. \]

As $n \to \infty$ we have $k a_n/n \to 1$ and $b_n/n \to 0$ and find $\limsup_{n\to\infty} \|T^n\|^{1/n} \le \tilde r(T) + \varepsilon$.
Since $\varepsilon > 0$ was arbitrary, this proves the claim.

We have already observed that $r(T) \le \lim_{n\to\infty} \|T^n\|^{1/n}$. On the other hand, if we fix
$\varepsilon > 0$ arbitrary and let $\Gamma$ denote a circular contour with radius $R = r(T) + \varepsilon$, then

\[ T^n = \frac{1}{2\pi i} \int_{\Gamma} \lambda^n R(\lambda,T) \, d\lambda \]

implies

\[ \|T^n\| \le \frac{1}{2\pi} \cdot 2\pi R \cdot R^n \sup_{\lambda \in \Gamma} \|R(\lambda,T)\|, \]

the supremum being finite in view of the continuity of $\lambda \mapsto R(\lambda,T)$. Taking $n$th roots
and passing to the limit $n \to \infty$, this implies $\lim_{n\to\infty} \|T^n\|^{1/n} \le R = r(T) + \varepsilon$. Since
$\varepsilon > 0$ was arbitrary, this completes the proof.
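Gelfand's formula can be observed numerically on a matrix whose norm is much larger than its spectral radius. The sketch below is illustrative only: it assumes $X = \mathbb{R}^2$ with the maximum norm, so that the operator norm is the maximal absolute row sum, and the matrix and helper names are arbitrary choices not taken from the text.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def op_norm_inf(a):
    """Operator norm induced by the maximum norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)

def gelfand_sequence(t, n_max):
    """Return the list of ||T^n||^(1/n) for n = 1, ..., n_max."""
    values = []
    power = t
    for n in range(1, n_max + 1):
        values.append(op_norm_inf(power) ** (1.0 / n))
        power = mat_mul(power, t)
    return values

# Upper triangular with both eigenvalues equal to 0.5, so r(T) = 0.5,
# while the norm ||T|| = 10.5 is far larger.
T = [[0.5, 10.0], [0.0, 0.5]]

roots = gelfand_sequence(T, 200)
```

For this matrix $\|T^n\| = 0.5^n (1 + 20n)$, so the computed values equal $0.5\,(1 + 20n)^{1/n}$: they start at $10.5$ and approach $r(T) = 0.5$ from above, slowly, as the theorem predicts.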


As an application we have the following stability result.

[Portrait: Israel Gelfand, 1913-2009]

Theorem 6.24 (Lyapunov's stability theorem). If $A \in \mathscr{L}(X)$ is a bounded operator with
$\sigma(A) \subseteq \{z \in \mathbb{C} : \operatorname{Re} z < 0\}$, then

\[ \lim_{t \to \infty} \|e^{tA}\| = 0. \]

Proof Since $\sigma(A)$ is compact, there is a $\delta > 0$ such that $\sigma(A) \subseteq \{z \in \mathbb{C} : \operatorname{Re} z \le -\delta\}$.
By the spectral mapping theorem (Theorem 6.21) this implies that
$\sigma(e^{A}) \subseteq \{z \in \mathbb{C} : |z| \le e^{-\delta}\}$. Stated differently, we have $r(e^{A}) \le e^{-\delta} < 1$. By
Theorem 6.23 there exists an integer $n_0 \ge 1$ such that $\|e^{n_0 A}\| < 1$. Set $M_0 := \|e^{n_0 A}\|$.

By estimating the power series defining $e^{sA}$, $\|e^{sA}\| \le e^{s\|A\|}$ for all $s \ge 0$. Let now $t > 0$
and write $t = k n_0 + r$ with $k \in \mathbb{N}$ and $r \in [0, n_0)$. Then

\[ \|e^{tA}\| = \|e^{k n_0 A} e^{rA}\| \le \|e^{n_0 A}\|^k e^{r\|A\|} \le M_0^k e^{n_0 \|A\|}. \]

Since $M_0 < 1$ this gives the result (with exponential rate), for $t \to \infty$ implies $k \to \infty$.

The interpretation of this theorem is as follows. Consider the initial value problem

\[ u'(t) = Au(t), \quad t \ge 0, \qquad u(0) = u_0, \]

where $A \in \mathscr{L}(X)$ is bounded and $u_0 \in X$ is given. By differentiating the power series
defining $e^{tA}$ we see that this problem is solved by the function $t \mapsto e^{tA} u_0$. Lyapunov's
theorem now gives the following sufficient spectral criterion for stability of this solution:
if the spectrum of $A$ is contained in the open left-half plane, then $\lim_{t\to\infty} \|e^{tA}\| = 0$.
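The decay asserted by Lyapunov's theorem, and the fact that it need not be monotone, can be seen on a small example. The sketch below assumes $X = \mathbb{R}^2$ with the maximum norm and computes $e^{tA}$ by a truncated power series combined with scaling and squaring; this is a standard numerical device chosen here for illustration, and the matrix and helper names are not taken from the text.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def op_norm_inf(a):
    """Operator norm induced by the maximum norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)

def expm(a, terms=25):
    """Approximate e^A by scaling and squaring: e^A = (e^(A / 2^s))^(2^s)."""
    n = len(a)
    s = 0
    while op_norm_inf(a) / (2 ** s) > 0.5:
        s += 1
    scaled = [[x / (2 ** s) for x in row] for row in a]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term B^k / k! with B = scaled
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(s):
        result = mat_mul(result, result)  # undo the scaling by repeated squaring
    return result

A = [[-1.0, 5.0], [0.0, -2.0]]  # spectrum {-1, -2} lies in the open left half-plane

def semigroup_norm(t):
    """Norm of e^(tA) with respect to the maximum norm on R^2."""
    return op_norm_inf(expm([[t * x for x in row] for row in A]))

norms = {t: semigroup_norm(t) for t in (0.0, 0.5, 5.0, 20.0)}
```

For this matrix $e^{tA}$ has first row $(e^{-t},\ 5(e^{-t} - e^{-2t}))$, so the norm first rises above $1$ before decaying exponentially; the theorem only makes a statement about the limit $t \to \infty$.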

Another interesting application of the holomorphic calculus arises when the spectrum
is the union of two nonempty disjoint compact sets.


Theorem 6.25 (Spectral projections). Suppose that $\sigma(T)$ is the union of two nonempty
disjoint compact sets $K_1$ and $K_2$, let $\Omega_1$ and $\Omega_2$ be disjoint open sets containing $K_1$ and
$K_2$, respectively, and let $\Gamma_1$ and $\Gamma_2$ be admissible contours for $K_1$ and $K_2$ in $\Omega_1$ and $\Omega_2$,
respectively. The operators

\[ P_j := \frac{1}{2\pi i} \int_{\Gamma_j} R(\lambda,T) \, d\lambda, \quad j \in \{1,2\}, \]

are projections, their ranges $X_1$ and $X_2$ are invariant under $T$, and we have a direct sum
decomposition $X = X_1 \oplus X_2$. Moreover,

\[ \sigma(T|_{X_j}) = K_j, \quad j \in \{1,2\}. \]


Proof To see that $P_j$ is a projection we just note that $P_j = f_j(T)$, where $f_j : \Omega_1 \cup \Omega_2 \to
\mathbb{C}$ is the holomorphic function which is 1 on $\Omega_j$ and 0 elsewhere. By the multiplicativity
of the holomorphic calculus, $P_j$ is bounded and

\[ P_j^2 = (f_j(T))^2 = f_j^2(T) = f_j(T) = P_j. \]

Also, $f_1 + f_2 = 1$ implies that $P_1 + P_2 = I$. We further have

\[ P_1 P_2 = f_1(T) f_2(T) = (f_1 f_2)(T) = 0 \]

since $f_1 f_2 = 0$, and similarly $P_2 P_1 = 0$. Consequently, $\mathsf{R}(P_1) \cap \mathsf{R}(P_2) = \{0\}$, for if $x =
P_1 x_1 = P_2 x_2$, then $P_1 x = P_1 P_2 x_2 = 0$ and $P_2 x = P_2 P_1 x_1 = 0$ and therefore $x = (P_1 + P_2)x =
0 + 0 = 0$. It follows that we have a direct sum decomposition $X = \mathsf{R}(P_1) \oplus \mathsf{R}(P_2)$. Since
$T$ commutes with $P_j$, it follows that $T$ maps $X_j = \mathsf{R}(P_j)$ into itself, and hence
$T$ restricts to a bounded operator $T_j := T|_{X_j}$ on $X_j$.

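Let $\mu \in \mathbb{C} \setminus K_j$. We show that $\mu \in \rho(T_j)$. Define

\[ S_\mu := \frac{1}{2\pi i} \int_{\Gamma_j} (\mu - \lambda)^{-1} R(\lambda,T) \, d\lambda, \]

where $\Gamma_j$ is a contour surrounding $K_j$ in $\Omega_j$ with $\mu$ on the exterior. Writing $\mu - T_j =
(\mu - \lambda) + (\lambda - T_j)$ and using that $T P_j x = T_j P_j x$ for all $x \in X$, we find

\[ \begin{aligned}
(\mu - T_j) S_\mu P_j x = S_\mu (\mu - T_j) P_j x
 &= \frac{1}{2\pi i} \int_{\Gamma_j} (\mu - \lambda)^{-1} R(\lambda,T) (\mu - T) P_j x \, d\lambda \\
 &= \frac{1}{2\pi i} \int_{\Gamma_j} [R(\lambda,T) + (\mu - \lambda)^{-1}] P_j x \, d\lambda \\
 &= \frac{1}{2\pi i} \int_{\Gamma_j} R(\lambda,T) P_j x \, d\lambda = P_j^2 x = P_j x,
\end{aligned} \]

which shows that $(\mu - T_j) S_\mu = S_\mu (\mu - T_j) = I$ on $\mathsf{R}(P_j)$. This proves that $\mu \in \rho(T_j)$.
We have shown that $\sigma(T_j) \subseteq K_j$. In particular this implies that $\sigma(T_1) \cap \sigma(T_2) = \emptyset$.
Next we claim that $\sigma(T) \subseteq \sigma(T_1) \cup \sigma(T_2)$; this concludes the proof since it gives

\[ K_1 \cup K_2 = \sigma(T) \subseteq \sigma(T_1) \cup \sigma(T_2) \subseteq K_1 \cup K_2 \]

and therefore equality holds at all steps. To prove the claim it suffices to note that if $\mu \notin
\sigma(T_1) \cup \sigma(T_2)$, then $\mu \in \rho(T_1) \cap \rho(T_2)$ and $R(\mu,T_1) \oplus R(\mu,T_2)$ is a two-sided inverse to
$\mu - T = (\mu - T_1) \oplus (\mu - T_2)$.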
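In finite dimensions the spectral projections of the theorem can be computed directly from their defining contour integrals. The following sketch assumes a $2 \times 2$ matrix with two distinct eigenvalues and uses the trapezoidal rule on small circles around each of them; the names `resolvent` and `spectral_projection`, the matrix, and the radii are illustrative assumptions rather than anything prescribed by the text.

```python
import cmath
import math

def resolvent(lam, t):
    """R(lam, T) = (lam - T)^(-1) for a 2x2 matrix T, via the adjugate formula."""
    a, b = lam - t[0][0], -t[0][1]
    c, d = -t[1][0], lam - t[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def spectral_projection(t, centre, radius, nodes=64):
    """Trapezoidal approximation of (1/(2 pi i)) * int R(lam, T) dlam over the
    counterclockwise circle |lam - centre| = radius."""
    total = [[0j, 0j], [0j, 0j]]
    for k in range(nodes):
        w = radius * cmath.exp(2j * math.pi * k / nodes)  # w = lam - centre
        r = resolvent(centre + w, t)
        for i in range(2):
            for j in range(2):
                total[i][j] += (w / nodes) * r[i][j]
    return total

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

T = [[1.0, 2.0], [0.0, 3.0]]  # sigma(T) = K_1 u K_2 with K_1 = {1}, K_2 = {3}

P1 = spectral_projection(T, centre=1.0, radius=1.0)  # circle encloses only 1
P2 = spectral_projection(T, centre=3.0, radius=1.0)  # circle encloses only 3
P1_squared = mat_mul(P1, P1)
P1_P2 = mat_mul(P1, P2)
```

Computing residues of the resolvent by hand gives $P_1 = \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}$ and $P_2 = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$, so one expects $P_1 + P_2 = I$, $P_1^2 = P_1$, and $P_1 P_2 = 0$ up to rounding.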


Problems 


6.1 Prove the following improvement to Lemma 6.6: if $T \in \mathscr{L}(X)$ is such that the sum
$\sum_{n=0}^{\infty} T^n x$ converges for all $x \in X$, then $I - T$ is invertible. What is its inverse?
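6.2 Show that if $\lambda, \mu \in \rho(T)$ satisfy $|\lambda - \mu| \le \delta r$ with $0 \le \delta < 1$, then

\[ \|R(\mu,T) - R(\lambda,T)\| \le \frac{\delta}{1 - \delta} \|R(\lambda,T)\|. \]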


6.3 Prove in an elementary way, by multiplying power series, that $e^{wT} e^{zT} = e^{(w+z)T}$
for all complex numbers $w, z \in \mathbb{C}$.

6.4 For $1 \le p \le \infty$ consider the right shift operator $T_r \in \mathscr{L}(\ell^p)$:

\[ T_r : (a_1, a_2, a_3, \dots) \mapsto (0, a_1, a_2, \dots). \]

Give a direct proof of the fact (see Example 6.3) that $\sigma(T_r) = \{\lambda \in \mathbb{C} : |\lambda| \le 1\}$.

6.5 We compute the spectra of some multiplier operators.

(a) Let $m \in C[0,1]$ and define $T_m \in \mathscr{L}(C[0,1])$ by

\[ (T_m f)(t) := m(t) f(t), \quad f \in C[0,1], \ t \in [0,1]. \]

Show that $\sigma(T_m)$ coincides with the range of $m$.

(b) Let $m \in L^\infty(0,1)$, let $1 \le p \le \infty$, and define $T_m \in \mathscr{L}(L^p(0,1))$ by

\[ (T_m f)(t) := m(t) f(t), \quad f \in L^p(0,1), \ t \in (0,1). \]

Show that $\sigma(T_m)$ coincides with the essential range of $m$, that is, the set of
all $\lambda \in \mathbb{K}$ such that for any open set $U \subseteq \mathbb{K}$ containing $\lambda$ the set $\{t \in (0,1) :
m(t) \in U\}$ has positive measure.

Hint: First show that $\frac{1}{\lambda - m}$ is essentially bounded if and only if $\lambda$ is not
contained in the essential range of $m$.

6.6 Let $T_m$ be the Fourier multiplier on $L^2(\mathbb{R}^d)$ with symbol $m \in L^\infty(\mathbb{R}^d)$.

(a) Show that $\sigma(T_m)$ equals the essential range of $m$.
Hint: Use the result of the preceding problem.

(b) Show that if $f$ is a holomorphic function defined on an open set containing
$\sigma(T_m)$, then $f \circ m \in L^\infty(\mathbb{R}^d)$ and $T_{f \circ m} = f(T_m)$, the latter being defined by
the holomorphic calculus.

6.7 Show that every nonempty compact subset of $\mathbb{C}$ occurs as the spectrum of some
bounded operator.

6.8 Show that if $P, Q$ are projections on a Banach space $X$ satisfying $\|P - Q\| < 1$,
then $\dim(\mathsf{R}(P)) = \dim(\mathsf{R}(Q))$ (admitting the possibility $\infty = \infty$).

Hint: The invertibility of $I - (P - Q)$ implies $PQX = P(I - P + Q)X = PX$.

6.9 Let $\mathscr{F}$ be the Fourier–Plancherel transform on $L^2(\mathbb{R}^d)$.

(a) Recalling that $\mathscr{F}^4 = I$ (see Problem 5.20), apply the spectral mapping
theorem to see that $\sigma(\mathscr{F}) \subseteq \{\pm 1, \pm i\}$.

(b) Show that $(\mathscr{F} - iI)(\mathscr{F} + I)(\mathscr{F} + iI)(\mathscr{F} - I) = 0$ and $(\mathscr{F} + I)(\mathscr{F} + iI)(\mathscr{F} -
I) \neq 0$, and deduce that $i \in \sigma(\mathscr{F})$.

(c) Prove that $\sigma(\mathscr{F}) = \{\pm 1, \pm i\}$.

6.10 Determine the spectrum of the Hilbert transform $H$ on $L^2(\mathbb{R})$.

Hint: Use the result of Problem 5.21.

6.11 Suppose that we have a direct sum decomposition $X = X_0 \oplus X_1$. Prove that if
$T \in \mathscr{L}(X)$ leaves both $X_0$ and $X_1$ invariant, then

\[ \sigma(T) = \sigma(T|_{X_0}) \cup \sigma(T|_{X_1}), \]

viewing $T|_{X_0}$ and $T|_{X_1}$ as bounded operators on the Banach spaces $X_0$ and $X_1$.

6.12 Let $S$ and $T$ be bounded operators on a Banach space $X$. Prove that

\[ \sigma(ST) \setminus \{0\} = \sigma(TS) \setminus \{0\}. \]

Hint: Use the Neumann series to relate the resolvents of $ST$ and $TS$.

6.13 Prove the claims about the example below Proposition 6.19.

6.14 The aim of this problem is to prove Chernoff's theorem: If $T$ is a bounded operator
on a Banach space $X$ satisfying $\sup_{n \in \mathbb{N}} \|T^n\| =: M < \infty$, then for all $n \in \mathbb{N}$ we have

\[ \|\exp(n(T - I))x - T^n x\| \le \sqrt{n}\, M \|Tx - x\|. \]

(a) Show that

\[ \|\exp(n(T - I))x - T^n x\| \le e^{-n} \sum_{k=0}^{\infty} \frac{n^k}{k!} \|T^k x - T^n x\|
   \le e^{-n} M \|Tx - x\| \sum_{k=0}^{\infty} \Bigl( \frac{n^k}{k!} \Bigr)^{1/2} \Bigl( \frac{n^k}{k!} \Bigr)^{1/2} |n - k|. \]

(b) Show that

\[ \sum_{k=0}^{\infty} \frac{n^k}{k!} (n - k)^2 = n e^n. \]

(c) Combine (a), (b), and the Cauchy–Schwarz inequality to complete the proof.

6.15 Let $T$ be a power bounded operator on a Banach space $X$, that is, $T$ is invertible
and $\sup_{k \in \mathbb{Z}} \|T^k\| < \infty$.

(a) Using the holomorphic calculus and its properties, explain that we can define
the bounded operators $S := -i \log T$ and $\sin(nS)$, $n \in \mathbb{N}$.

(b) Show that $\sin(nS) = \frac{1}{2i}(T^n - T^{-n})$, $n \in \mathbb{N}$.

(c) Using the spectral mapping theorem, show that $\sigma(nS) = \sigma(\sin(nS)) = \{0\}$.

We now use that if $\sum_{k=0}^{\infty} c_k z^k$ denotes the Taylor series of the principal branch of
$\arcsin z$ at $z = 0$, then $c_k \ge 0$ for all $k \in \mathbb{N}$ and $\sum_{k=0}^{\infty} c_k = \arcsin(1) = \frac{\pi}{2}$.

(d) Show that $nS = \arcsin(\sin(nS))$, where the latter is again defined by means
of the holomorphic calculus, and deduce that

\[ \|nS\| \le \frac{\pi}{2} \sup_{k \in \mathbb{Z}} \|T^k\|. \]

Conclude that $S = 0$ and $T = e^{iS} = I$.


7 


Compact Operators 


This chapter studies the class of compact operators. By definition, these are the operators 
that map bounded sets to relatively compact sets. Examples include integral operators 
on various Banach spaces of functions over a compact domain. Because of this, com- 
pact operators have important applications in the theory of partial differential equations 
and Mathematical Physics. After establishing some generalities we prove the Riesz— 
Schauder theorem, which asserts that the non-zero part of the spectrum of a compact 
operator is discrete and consists of eigenvalues. 

The final section of this chapter presents an introduction to the theory of Fredholm 
operators. These are the operators that are invertible modulo a compact operator, and 
their degree of non-invertibility is quantified by the so-called Fredholm index. As an 
example we prove the Gohberg–Krein–Noether theorem, which states that a Toeplitz
operator with continuous zero-free symbol is Fredholm and its index equals the negative
winding number of its symbol.


7.1 Compact Operators 
Let X and Y be Banach spaces. 


Definition 7.1 (Compact operators). An operator $T \in \mathscr{L}(X,Y)$ is compact if it maps
bounded sets to relatively compact sets.

Since every bounded set in $X$ is contained in a multiple of the unit ball $B_X = \{x \in
X : \|x\| < 1\}$, a bounded operator $T$ is compact if and only if $TB_X$ is relatively compact.
Furthermore, using that a subset of a Banach space is relatively compact if and only if
it is relatively sequentially compact, a linear operator $T$ is compact if and only if
$(Tx_n)_{n\ge 1}$ has a convergent subsequence for every bounded sequence $(x_n)_{n\ge 1}$ in $X$.

The set of all compact operators is a linear subspace of $\mathscr{L}(X,Y)$. It is clear that $cT$
is compact if $T$ is compact, for any scalar $c \in \mathbb{K}$, and if $S$ and $T$ are compact, then also
$S + T$ is compact: for if $SB_X$ and $TB_X$ are contained in the compact sets $K$ and $L$, then
$(S + T)B_X$ is contained in the compact set $K + L$ (this set is the image of the compact set
$K \times L$ under the continuous map $(y_1, y_2) \mapsto y_1 + y_2$ from $Y \times Y$ to $Y$). It is also a
two-sided ideal in $\mathscr{L}(X,Y)$, in the sense that if $T \in \mathscr{L}(X,Y)$ is compact and $S \in \mathscr{L}(X',X)$
and $U \in \mathscr{L}(Y,Y')$ are bounded, then $UTS \in \mathscr{L}(X',Y')$ is compact. Indeed, if $C$ is a
bounded set in $X'$, then $S(C)$ is bounded in $X$, so $TS(C)$ is contained in a compact set $K$
of $Y$, and then $UTS(C)$ is contained in the compact set $U(K)$.


Example 7.2. As an immediate corollary to Theorem 1.38, the identity operator on a 
Banach space X is compact if and only if X is finite-dimensional. 


Example 7.3. A bounded operator is said to be of finite rank if its range is finite- 
dimensional. Since bounded sets in finite-dimensional spaces are relatively compact, 
every finite rank operator is compact. 


Example 7.4 (Integral operators on $C(K)$). Let $\mu$ be a finite Borel measure on a compact
metric space $K$ and let $k : K \times K \to \mathbb{K}$ be continuous. Then the operator $T : C(K) \to
C(K)$,

\[ Tf(x) := \int_K k(x,y) f(y) \, d\mu(y), \quad f \in C(K), \ x \in K, \]

is well defined and bounded by Example 1.30. Let us show that $T$ is compact. Let
$(f_n)_{n\ge 1}$ be a bounded sequence in $C(K)$. We claim that the bounded sequence $(Tf_n)_{n\ge 1}$
is equicontinuous. Once we have shown this, the Arzelà–Ascoli theorem (Theorem
2.11) implies that this sequence is relatively compact, hence has a convergent
subsequence. This implies that $T$ is compact.

To check the equicontinuity we first note that $K \times K$ is compact and hence $k$ is
uniformly continuous, in the sense that given any $\varepsilon > 0$ we can find $\delta > 0$ such that
$d(x,x') + d(y,y') < \delta$ implies $|k(x,y) - k(x',y')| < \varepsilon$. Then, if $d(x,x') < \delta$,

\[ |Tf_n(x) - Tf_n(x')| \le \int_K |k(x,y) - k(x',y)| \, |f_n(y)| \, d\mu(y) \le \varepsilon M \mu(K), \]

where $M = \sup_{n\ge 1} \|f_n\|_\infty$. The equicontinuity follows immediately from this.


The next proposition shows that the compact operators form a closed subspace in
$\mathscr{L}(X,Y)$. This subspace will be denoted by $\mathscr{K}(X,Y)$ and we write $\mathscr{K}(X) := \mathscr{K}(X,X)$.

Proposition 7.5. If $\lim_{n\to\infty} \|T_n - T\| = 0$ with each $T_n$ compact, then $T$ is compact. In
particular, uniform limits of finite rank operators are compact.

Proof For any $\varepsilon > 0$ we can choose an index $n_0 \ge 1$ such that $\|T_{n_0} - T\| < \varepsilon$. Since
$T_{n_0} B_X$ is relatively compact and $TB_X \subseteq T_{n_0} B_X + B(0;\varepsilon)$, the relative compactness of
$TB_X$ follows from Proposition 1.40.

The following converse of the second assertion of Proposition 7.5 holds for Hilbert
space operators:

Proposition 7.6. Let $H$ and $K$ be Hilbert spaces. An operator $T \in \mathscr{L}(H,K)$ is compact
if and only if it is the uniform limit of finite rank operators.


Proof It remains to prove the ‘only if’ part. Let $T$ be compact, say $TB_H \subseteq C$ with
$C \subseteq K$ compact, where $B_H$ is the open unit ball of $H$. Fix an arbitrary $\varepsilon > 0$ and let
$B(y_1;\varepsilon), \dots, B(y_N;\varepsilon)$ be an open cover of $C$. Let $Y$ denote the linear span of $\{y_1, \dots, y_N\}$
and let $P$ be the orthogonal projection in $K$ onto $Y$. This projection is of finite rank, and
therefore $PT$ is of finite rank. For any $x \in H$ of norm $\|x\| < 1$ we have $Tx \in C$, so
$\|Tx - y_n\| < \varepsilon$ for some $1 \le n \le N$. Then, noting that $Py_n = y_n$ and using that $\|P\| \le 1$,

\[ \|Tx - PTx\| \le \|Tx - y_n\| + \|y_n - PTx\| < \varepsilon + \|P(y_n - Tx)\| \le \varepsilon + \|y_n - Tx\| < 2\varepsilon. \]

Taking the supremum over all $x \in H$ with $\|x\| < 1$ we obtain $\|T - PT\| \le 2\varepsilon$.
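A simple instance of this approximation, included here only as an illustration and not taken from the text, is a diagonal operator on $\ell^2$ whose diagonal entries tend to zero; the projections $P_N$ below play the role of the projection $P$ in the proof.

```latex
Let $(e_n)_{n\ge 1}$ be the standard unit basis of $H = \ell^2$ and define $T \in \mathscr{L}(\ell^2)$ by
$T e_n := \frac{1}{n} e_n$. Let $P_N$ be the orthogonal projection onto the span of
$e_1, \dots, e_N$. Then $P_N T$ has finite rank, and since $T - P_N T$ is the diagonal
operator with entries $0, \dots, 0, \frac{1}{N+1}, \frac{1}{N+2}, \dots$,
\[
  \| T - P_N T \| = \sup_{n > N} \frac{1}{n} = \frac{1}{N+1} \to 0 \quad \text{as } N \to \infty .
\]
Hence $T$ is a uniform limit of finite rank operators and is compact by Proposition 7.5.
```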


Example 7.7 (Integral operators on $L^2(K,\mu)$). Let $\mu$ be a finite Borel measure on a
compact metric space $K$ and let $k : K \times K \to \mathbb{K}$ be square integrable. Then the operator
$T : L^2(K) \to L^2(K)$,

\[ Tf(x) := \int_K k(x,y) f(y) \, d\mu(y), \quad f \in L^2(K), \ x \in K, \]

is well defined and bounded by Example 1.30. Let us prove that $T$ is compact.

Fix $\varepsilon > 0$. Since $C(K \times K)$ is dense in $L^2(K \times K, \mu \times \mu)$ we may choose $\kappa \in C(K \times K)$
such that $\|\kappa - k\|_2 < \varepsilon$. Since $\kappa$ is uniformly continuous we can find $\delta > 0$ such that
$|\kappa(x,y) - \kappa(x',y')| < \varepsilon$ whenever $d(x,x') + d(y,y') < \delta$. Starting from a finite cover of
$K$ by open balls with diameter at most $\frac12\delta$, we can write $K = B_1 \cup \dots \cup B_n$ with disjoint
Borel sets of diameter at most $\frac12\delta$. Set

\[ \tilde k := \sum_{j,k=1}^{n} \kappa(x_j, y_k) \, \mathbf{1}_{B_j \times B_k}, \]

where $x_j \in B_j$ and $y_k \in B_k$ are chosen arbitrarily. For this function we have $\|\kappa - \tilde k\|_\infty \le \varepsilon$.
Then, $\|\kappa - \tilde k\|_2 \le \varepsilon \mu(K \times K)^{1/2} = \varepsilon \mu(K)$ and hence

\[ \|k - \tilde k\|_2 \le \|k - \kappa\|_2 + \|\kappa - \tilde k\|_2 < \varepsilon (1 + \mu(K)). \tag{7.1} \]

The integral operator with kernel $\tilde k$, which we denote by $\tilde T$, is given explicitly as

\[ \tilde T f = \sum_{j=1}^{n} \Bigl( \sum_{k=1}^{n} \kappa(x_j, y_k) \int_{B_k} f \, d\mu \Bigr) \mathbf{1}_{B_j}. \]

This shows that $\tilde T$ is of finite rank and therefore compact. By (1.4) (with $k$ replaced by
$k - \tilde k$) and (7.1) we have

\[ \|T - \tilde T\| \le \|k - \tilde k\|_2 < \varepsilon (1 + \mu(K)). \]

Since $\varepsilon > 0$ was arbitrary this proves that $T$ can be approximated in the operator norm
by compact operators, and is therefore compact by Proposition 7.5.
We conclude this section with a duality result for compact operators. 
Proposition 7.8. An operator $T \in \mathscr{L}(X,Y)$ is compact if and only if its adjoint $T^* \in
\mathscr{L}(Y^*,X^*)$ is compact.

Proof First we prove the ‘only if’ part. Let $T$ be compact and let $K$ denote the closure
of $TB_X$. By assumption, $K$ is a compact subset of $Y$. By restriction, every $y^* \in Y^*$
determines a function in $C(K)$ given by $y^*(y) := \langle y, y^* \rangle$ for $y \in K$. Moreover, if $(y_n^*)_{n\ge 1}$
is a bounded sequence in $Y^*$, the corresponding functions are uniformly bounded and
equicontinuous; the latter follows from $|\langle x - x', y_n^* \rangle| \le M \|x - x'\|$ with $M := \sup_{n\ge 1} \|y_n^*\|$.
Hence, by the Arzelà–Ascoli theorem, there is a subsequence $(y_{n_j}^*)_{j\ge 1}$ such that

\[ \lim_{j,k\to\infty} \|y_{n_k}^* - y_{n_j}^*\|_{C(K)} = 0. \]

Then,

\[ \begin{aligned}
\lim_{j,k\to\infty} \|T^* y_{n_k}^* - T^* y_{n_j}^*\|
 &= \lim_{j,k\to\infty} \sup_{\|x\| \le 1} |\langle x, T^* y_{n_k}^* - T^* y_{n_j}^* \rangle| \\
 &= \lim_{j,k\to\infty} \sup_{\|x\| \le 1} |\langle Tx, y_{n_k}^* - y_{n_j}^* \rangle|
 \le \lim_{j,k\to\infty} \|y_{n_k}^* - y_{n_j}^*\|_{C(K)} = 0.
\end{aligned} \]

Thus $(T^* y_{n_j}^*)_{j\ge 1}$ is a Cauchy sequence in $X^*$, and we have shown that $(T^* y_n^*)_{n\ge 1}$ has a
convergent subsequence. It follows that $T^*$ is compact.

To prove the ‘if’ part suppose that $T^*$ is compact. Then, by what we just proved,
$T^{**}$ is compact as an operator from $X^{**}$ to $Y^{**}$. Identifying $X$ with a closed subspace
of $X^{**}$ in the natural way, the restriction of $T^{**}$ to $X$ maps $X$ to $Y$ and equals $T$. Since
the restriction of a compact operator is compact, the compactness of $T^{**}$ implies the
compactness of $T$.


7.2 The Riesz—Schauder Theorem 


As we have seen in earlier examples, the spectrum of a bounded operator need not 
contain eigenvalues. This is in sharp contrast to the situation in finite dimensions, where 
the spectra of matrices consist of eigenvalues. The aim of the present section is to show 
that compact operators spectrally resemble matrices to some degree. The main result is 
Theorem 7.11, which shows that the nonzero part of the spectrum of a compact operator
is discrete and consists of eigenvalues.

Lemma 7.9. If $T \in \mathscr{L}(X)$ is compact, then:

(1) $\mathsf{N}(I - T)$ is finite-dimensional;
(2) $\mathsf{R}(I - T)$ is closed.


Proof (1): Let $B_Y$ denote the unit ball of $Y := \mathsf{N}(I - T)$. We have $Ty = y$ for all $y \in Y$,
so the compactness of $T$ implies that $B_Y = TB_Y$ is relatively compact. By Theorem 1.38,
this implies that $Y$ is finite-dimensional.

(2): Let $Y := \mathsf{N}(I - T)$ and consider the linear mapping $S : X/Y \to X$ defined by

\[ S(x + Y) := (I - T)x, \quad x \in X. \]

Then $S$ is well defined and bounded as a quotient operator. We claim that there exists a
constant $c > 0$ such that

\[ \|S(x + Y)\| \ge c \|x + Y\|, \quad x \in X. \tag{7.2} \]

If this were false we would be able to find elements $x_n \in X$ satisfying $\|x_n + Y\| = 1$ and
$\|S(x_n + Y)\| < 1/n$; replacing $x_n$ by another representative of $x_n + Y$ we may assume
that $\|x_n\| \le 2$. Then $(I - T)x_n = S(x_n + Y) \to 0$. By the compactness of $T$ we can
find a subsequence such that $Tx_{n_k} \to x_0$ for some $x_0 \in X$. Then,

\[ \lim_{k\to\infty} x_{n_k} = \lim_{k\to\infty} [(I - T)x_{n_k} + Tx_{n_k}] = 0 + x_0 = x_0. \]

By the boundedness of $I - T$,

\[ (I - T)x_0 = \lim_{k\to\infty} (I - T)x_{n_k} = \lim_{k\to\infty} S(x_{n_k} + Y) = 0, \]

and therefore $x_0 \in \mathsf{N}(I - T) = Y$. The contradiction $0 = \|x_0 + Y\| = \lim_{k\to\infty} \|x_{n_k} + Y\| =
1$ concludes the proof of (7.2).

By Proposition 1.21, (7.2) implies that $S$ is injective and the range $\mathsf{R}(S)$ of $S$ is closed.
Finally, $\mathsf{R}(S) = \mathsf{R}(I - T)$ and therefore the range of $I - T$ is closed.


In order to describe the spectra of compact operators we need the following lemma. 


Lemma 7.10. Let $T \in \mathscr{L}(X)$ be a compact operator. Then $I-T$ is injective if and only
if $I-T$ is surjective.


Proof We begin with the proof of the 'only if' part. We assume that $I-T$ is injective
but not surjective and deduce a contradiction.

By assumption, $X_1 := R(I-T)$ is a proper subspace of $X$ and by Lemma 7.9 this
subspace is closed. It is clear that $TX_1 \subseteq X_1$. Let $T_1$ denote the restriction of $T$ to $X_1$.
This operator is compact.

We claim that $I-T_1$ is not surjective. Suppose the contrary. Then $I-T_1$ is injective
and surjective, hence invertible, and since for all $x \in X$ we have $(I-T)x \in X_1$ it follows,
by the injectivity of $I-T$, that $x = (I-T_1)^{-1}(I-T)x \in X_1$. This implies that $X_1 = X$.
Then $I-T$ is surjective after all. This contradiction proves the claim.

The same argument shows that $X_2 := R(I-T_1)$ is a proper closed subspace of $X_1$. It
is clear that $TX_2 \subseteq X_2$. Let $T_2$ denote the restriction of $T$ to $X_2$. Continuing as above we
obtain a strictly decreasing sequence of closed subspaces $X_1 \supsetneq X_2 \supsetneq \dots$, each of which
is $T$-invariant, such that

\[ X_{n+1} = (I-T)X_n, \qquad n = 1,2,\dots \]

By Lemma 1.39 we can select vectors $x_n \in X_n \setminus X_{n+1}$ of norm one such that

\[ \inf_{y \in X_{n+1}} \|x_n - y\| \ge \frac12. \tag{7.3} \]

Since $T$ is compact, $(Tx_n)_{n\ge1}$ has a convergent subsequence $(Tx_{n_k})_{k\ge1}$. Then, for $\ell > k$,

\[ \|Tx_{n_k} - Tx_{n_\ell}\| = \|x_{n_k} + (T-I)x_{n_k} - Tx_{n_\ell}\| \ge \frac12, \]

where we use (7.3) along with $(T-I)x_{n_k} \in X_{n_k+1}$ and $Tx_{n_\ell} \in X_{n_\ell} \subseteq X_{n_k+1}$. This
contradicts the convergence of $(Tx_{n_k})_{k\ge1}$.


Turning to the 'if' part, assume that $I-T$ is surjective. If $T^*x^* = x^*$ for some $x^* \in X^*$,
then by writing an arbitrary $x \in X$ as $(I-T)y$, we find $\langle x,x^*\rangle = \langle y,(I-T^*)x^*\rangle = 0$ for
all $x \in X$, so $x^* = 0$. This shows that $I-T^*$ is injective. Applying the 'only if' part
to the compact operator $T^*$ it follows that $I-T^*$ is surjective as well. Then from the argument
just given, applied to $T^*$, it follows that $I-T^{**}$ is injective, and hence its restriction $I-T$ is injective.
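
Lemma 7.10 fails for general bounded operators. The following standard illustration is added here as an example and is not part of the text; the notation $S_r$ for the right shift is introduced only for this sketch.

```latex
% Illustration (assumed example): I - T injective but not surjective when T is not compact.
\text{On } \ell^2 \text{ let } S_r(x_1,x_2,\dots) := (0,x_1,x_2,\dots) \text{ and put } T := I - S_r. \\
\text{Then } I - T = S_r \text{ is injective (indeed isometric), but } e_1 = (1,0,0,\dots) \notin R(S_r). \\
\text{By Lemma 7.10 the operator } T = I - S_r \text{ can therefore not be compact.}
```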


Theorem 7.11 (Riesz–Schauder). Let $T \in \mathscr{L}(X)$ be a compact operator. Then:

(1) every nonzero $\lambda \in \sigma(T)$ is an eigenvalue of $T$ and the eigenspace

\[ E_\lambda := \{x \in X : Tx = \lambda x\} \]

is finite-dimensional;
(2) for every $r > 0$, the number of eigenvalues satisfying $|\lambda| \ge r$ is finite;
(3) if $\dim(X) = \infty$, then $0 \in \sigma(T)$.


Proof (1): Let $0 \ne \lambda \in \sigma(T)$ and suppose that $\lambda$ is not an eigenvalue of $T$. Then
$I - \lambda^{-1}T$ is injective and hence, by the preceding lemma, surjective. It follows that
$I - \lambda^{-1}T$ is invertible, and this implies that $\lambda \in \rho(T)$, a contradiction.

Since $T$ acts as a multiple of the identity on the subspace $E_\lambda$ and $T$ is compact, the
identity operator on $E_\lambda$ is compact. By Theorem 1.38, this implies that $E_\lambda$ is
finite-dimensional.


(2): Suppose there is a sequence of distinct eigenvalues $\lambda_n$, all of them satisfying
$|\lambda_n| \ge r$. Let $x_n \in X$ be eigenvectors for $\lambda_n$ of norm one.

Let $Y_n$ denote the linear span of $\{x_1,\dots,x_n\}$, $n \ge 1$, and set $Y_0 := \{0\}$. By the
distinctness of the eigenvalues, the vectors $x_n$ are linearly independent (this is easily proved
by induction on $n$). Therefore $\dim(Y_n) = n$. In particular, $Y_n$ is a proper subspace of
$Y_{n+1}$. For $y \in Y_n$, say $y = \sum_{j=1}^n c_jx_j$, we have

\[ Ty = \sum_{j=1}^n c_j\lambda_jx_j \in Y_n \]

and

\[ (\lambda_n - T)y = \sum_{j=1}^n c_j(\lambda_n-\lambda_j)x_j = \sum_{j=1}^{n-1} c_j(\lambda_n-\lambda_j)x_j \in Y_{n-1}. \]

Lemma 1.39 shows that for every $n \ge 1$ it is possible to find a vector $y_n \in Y_n$ of norm
one such that $\|y - y_n\| \ge \frac12$ holds for all $y \in Y_{n-1}$. Arguing as in the proof of Lemma
7.10, for $n > m$ these vectors satisfy the lower bound

\[ \|Ty_n - Ty_m\| = \|\lambda_ny_n + (T-\lambda_n)y_n - Ty_m\|
   = |\lambda_n|\,\Bigl\|y_n - \frac{Ty_m - (T-\lambda_n)y_n}{\lambda_n}\Bigr\| \ge \frac12|\lambda_n| \ge \frac12 r, \]

using that $Ty_m \in Y_m \subseteq Y_{n-1}$ and $(T-\lambda_n)y_n \in Y_{n-1}$, and therefore $(Ty_n)_{n\ge1}$ cannot have
a convergent subsequence. This contradicts the compactness of $T$.

(3): If $T$ is invertible, then $T^{-1}B_X$ is bounded and $B_X = T(T^{-1}B_X)$ is relatively
compact. Therefore $X$ must be finite-dimensional.


For compact normal operators on a Hilbert space, a more direct proof is sketched in Problem 9.2.
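
Part (2) of Theorem 7.11 can be made concrete for the diagonal operator $Te_n = n^{-1}e_n$ on $\ell^2$, which is compact with eigenvalues $1/n$. The following Python sketch is an illustration under that assumption; the function name `eigenvalues_at_least` and the cut-off `max_index` are hypothetical and serve only this example. It counts the eigenvalues with $|\lambda| \ge r$ in exact rational arithmetic.

```python
from fractions import Fraction


def eigenvalues_at_least(r, max_index=1000):
    """Indices n <= max_index for which the eigenvalue 1/n of diag(1, 1/2, 1/3, ...) is >= r."""
    r = Fraction(r)
    return [n for n in range(1, max_index + 1) if Fraction(1, n) >= r]


# Usage: for each r > 0 only finitely many eigenvalues satisfy |lambda| >= r.
for r in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100)):
    print(r, len(eigenvalues_at_least(r)))
```

The count grows without bound as $r \downarrow 0$, reflecting that $0$ is the only possible accumulation point of the eigenvalues.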


Definition 7.12. The number $\dim(E_\lambda)$ is called the geometric multiplicity of $\lambda$.


Suppose now that $T \in \mathscr{L}(X)$ is compact and that $0 \ne \lambda \in \sigma(T)$. Then $\lambda$ is an isolated point
of $\sigma(T)$, that is, for small enough $r > 0$ we have $B(\lambda;r) \cap \sigma(T) = \{\lambda\}$. With $0 < r' < r$ and
$\Gamma = \partial B(\lambda;r')$ oriented counterclockwise, the spectral projection corresponding to $\lambda$ (see Theorem
6.25) is given by

\[ P_\lambda = \frac{1}{2\pi i}\int_\Gamma R(\mu,T)\,d\mu. \]

[Portrait: Juliusz Schauder, 1899–1943]

By Theorem 6.25 the range $X_\lambda := P_\lambda X$ is invariant under $T$ and $\sigma(T|_{X_\lambda}) = \{\lambda\}$. In
particular, $T|_{X_\lambda}$ is invertible. This operator is also compact as an operator on $X_\lambda$. This is
only possible if $\dim(X_\lambda)$ is finite. Thus we have proved:

Corollary 7.13. Let $T$ be a compact operator on a Banach space $X$. For all nonzero
$\lambda \in \sigma(T)$ the range $X_\lambda$ of the spectral projection $P_\lambda$ is finite-dimensional.


Definition 7.14. The number $\dim(X_\lambda)$ is called the algebraic multiplicity of $\lambda$.


Proposition 7.15. Let $T \in \mathscr{L}(X)$ be a compact operator. Then the geometric multiplicity
of every nonzero $\lambda \in \sigma(T)$ is less than or equal to its algebraic multiplicity.


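Proof If $Tx = \lambda x$ and $\Gamma$ is as before, then $R(\mu,T)x = (\mu-\lambda)^{-1}x$ for all $\mu \in \Gamma$ and therefore
$P_\lambda x = \bigl(\frac{1}{2\pi i}\int_\Gamma (\mu-\lambda)^{-1}\,d\mu\bigr)x = x$. This
shows that the eigenspace $E_\lambda$ is contained in the range of the projection $P_\lambda$.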


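Example 7.16 (Jordan normal form). Consider a $k \times k$ Jordan block

\[ J_\lambda = \begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \ddots & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
 & & & \lambda & 1 \\
0 & \cdots & & 0 & \lambda
\end{pmatrix} \]

and its resolvent

\[ R(\mu,J_\lambda) = \begin{pmatrix}
(\mu-\lambda)^{-1} & (\mu-\lambda)^{-2} & \cdots & (\mu-\lambda)^{-k} \\
0 & (\mu-\lambda)^{-1} & \ddots & \vdots \\
\vdots & \ddots & \ddots & (\mu-\lambda)^{-2} \\
0 & \cdots & 0 & (\mu-\lambda)^{-1}
\end{pmatrix}. \]

If $A$ is a matrix with $\lambda \in \sigma(A)$ and if $J_\lambda$ is the corresponding block in the Jordan normal
form of $A$, it follows that the spectral projection corresponding to $\lambda$ is given by

\[ P_\lambda = \frac{1}{2\pi i}\int_\Gamma R(\mu,A)\,d\mu = I_\lambda, \]

where $I_\lambda$ is the diagonal matrix with 1's on the diagonal entries corresponding to the
Jordan block $J_\lambda$ and 0's elsewhere. It follows that the algebraic multiplicity $\nu_\lambda$ of $\lambda$
equals $k$, the dimension of the Jordan block.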
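
For a $2 \times 2$ Jordan block the computation can be carried out by hand. The following worked case is an illustration added here, assuming the convention $R(\mu,A) = (\mu - A)^{-1}$ for the resolvent; it shows that the inequality of Proposition 7.15 can be strict.

```latex
% Worked case k = 2, assuming R(mu, A) = (mu - A)^{-1}.
J_\lambda = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}, \qquad
R(\mu,J_\lambda) = (\mu - J_\lambda)^{-1}
  = \begin{pmatrix} (\mu-\lambda)^{-1} & (\mu-\lambda)^{-2} \\ 0 & (\mu-\lambda)^{-1} \end{pmatrix}. \\
\frac{1}{2\pi i}\int_\Gamma (\mu-\lambda)^{-1}\,d\mu = 1, \qquad
\frac{1}{2\pi i}\int_\Gamma (\mu-\lambda)^{-2}\,d\mu = 0
\quad\Longrightarrow\quad P_\lambda = I. \\
E_\lambda = N(\lambda - J_\lambda) = \operatorname{span}\{(1,0)\}, \qquad
\dim E_\lambda = 1 < 2 = \dim R(P_\lambda).
```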


7.3 Fredholm Theory 


Throughout this section we let X and Y be Banach spaces.


7.3.a The Fredholm Alternative


From the results in the preceding section we know that if $T$ is a compact operator on a
Banach space $X$, then every nonzero $\lambda \in \sigma(T)$ is an eigenvalue and the corresponding
eigenspace is finite-dimensional. The next theorem (when applied to the compact operator
$\lambda^{-1}T$) asserts that the dimension of the eigenspace is equal to the codimension of
the range of $\lambda - T$. This generalises the elementary result in Linear Algebra that for a
$d \times d$ matrix $A$ we have $\dim N(A) + \dim R(A) = d$.


Theorem 7.17 (Fredholm alternative). If $T \in \mathscr{L}(X)$ is compact, then

\[ \dim N(I-T) = \operatorname{codim} R(I-T). \]


This theorem contains Lemma 7.10 as a special case. The proof of the theorem is based on
the following geometric lemma. Recall that if $Y \subseteq X^*$, then

\[ {}^{\perp}Y = \{x \in X : \langle x,x^*\rangle = 0 \text{ for all } x^* \in Y\}. \]


Lemma 7.18. If $Y$ is a finite-dimensional subspace of $X^*$, then ${}^{\perp}Y$ has finite codimension in $X$
and $\operatorname{codim}({}^{\perp}Y) = \dim Y$.


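Proof Let $x_1^*,\dots,x_d^*$ be a basis of $Y$ and consider the mapping $\psi$ from $X$ to $\mathbb{K}^d$,

\[ \psi : x \mapsto (\langle x,x_1^*\rangle,\dots,\langle x,x_d^*\rangle). \]

[Portrait: Ivar Fredholm, 1866–1927]

We claim that this mapping is surjective. Indeed, if $\xi \in \mathbb{K}^d$ is such that $\psi(x)\cdot\xi = 0$
for all $x \in X$, that is, $\sum_{j=1}^d \langle x,x_j^*\rangle\xi_j = 0$ for all $x \in X$, then $\sum_{j=1}^d \xi_jx_j^* = 0$ and therefore
$\xi = 0$ by linear independence. This proves the claim. As a consequence there exist
$x_j \in X$ such that $\psi(x_j) = e_j$, the $j$th unit vector of $\mathbb{K}^d$. The resulting sequence $x_1,\dots,x_d$
has the property that

\[ \langle x_i,x_j^*\rangle = \delta_{ij}, \qquad i,j = 1,\dots,d. \]

The vectors $x_j$ are linearly independent, for if $\sum_{j=1}^d c_jx_j = 0$, then $c_k = \langle \sum_{j=1}^d c_jx_j, x_k^*\rangle = 0$
for all $k = 1,\dots,d$.

Now suppose that an arbitrary $x \in X$ is given and set $\tilde{x} := x - \sum_{j=1}^d c_jx_j$, where $c_j =
\langle x,x_j^*\rangle$. Then $\langle \tilde{x},x_k^*\rangle = \langle x,x_k^*\rangle - c_k = 0$ for all $k = 1,\dots,d$, so $\tilde{x} \in {}^{\perp}Y$. This shows that
$X = ({}^{\perp}Y) + X_0$, where $X_0$ is the linear span of $x_1,\dots,x_d$. If $x \in ({}^{\perp}Y) \cap X_0$, then we can
write $x = \sum_{j=1}^d c_jx_j$ since $x \in X_0$, and we have $c_k = \sum_{j=1}^d c_j\langle x_j,x_k^*\rangle = \langle x,x_k^*\rangle = 0$ for all $k = 1,\dots,d$
since $x \in {}^{\perp}Y$. It follows that $x = 0$. Thus we obtain the direct sum decomposition $X =
({}^{\perp}Y) \oplus X_0$, and therefore $\operatorname{codim}({}^{\perp}Y) = \dim X_0 = d = \dim(Y)$.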
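
The construction in this proof can be followed in a concrete case. The example below is an illustration added here, not part of the text; the space $C[0,1]$ and the point evaluations $\delta_0, \delta_1$ are chosen only for this purpose.

```latex
% Illustration (assumed example): the biorthogonal system of the proof for two point evaluations.
X = C[0,1], \qquad Y = \operatorname{span}\{\delta_0,\delta_1\} \subseteq X^*, \qquad \langle f,\delta_t\rangle = f(t). \\
{}^{\perp}Y = \{f \in C[0,1] : f(0) = f(1) = 0\}. \\
x_1(t) := 1-t, \quad x_2(t) := t
\quad\Longrightarrow\quad
\langle x_1,\delta_0\rangle = \langle x_2,\delta_1\rangle = 1, \quad
\langle x_1,\delta_1\rangle = \langle x_2,\delta_0\rangle = 0. \\
f = \underbrace{\bigl(f - f(0)x_1 - f(1)x_2\bigr)}_{\in\,{}^{\perp}Y}
  + \underbrace{f(0)x_1 + f(1)x_2}_{\in\,X_0}, \qquad
\operatorname{codim}({}^{\perp}Y) = 2 = \dim Y.
```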
Proof of Theorem 7.17 We begin by recalling that, by Lemma 7.9, $N(I-T)$ is finite-dimensional
and $R(I-T)$ is closed. Hence Proposition 5.15, applied to $I-T$, implies
that $R(I-T) = {}^{\perp}(N(I-T^*))$ and therefore, by Lemma 7.18 (which can be applied
because Lemma 7.9 applied to the compact operator $T^*$ gives $\dim N(I-T^*) < \infty$),

\[ \operatorname{codim} R(I-T) = \operatorname{codim}\bigl({}^{\perp}(N(I-T^*))\bigr) = \dim N(I-T^*). \]

Thus it remains to prove that $d := \dim N(I-T) = \dim N(I-T^*) =: d^*$.
Step 1: We first prove that $d^* \le d$. Reasoning by contradiction, suppose that $d^* > d$.
Since $N(I-T)$ is finite-dimensional, by Proposition 4.16(1) we have a direct sum
decomposition

\[ X = N(I-T) \oplus Y \]

for some closed subspace $Y$ of $X$. Also, since $R(I-T)$ is closed and has finite codimension
$d^*$, by Proposition 4.16(2) we have a direct sum decomposition

\[ X = R(I-T) \oplus Z \tag{7.4} \]

for some closed subspace $Z$ of $X$ of dimension $d^*$. Since $d < d^*$ there is an injective
linear map $L : N(I-T) \to Z$ that is not surjective. Set $S := T + L\circ\pi$, where $\pi$ is the
projection in $X$ onto $N(I-T)$ along $Y$. Since $L$ is a finite rank operator, it is compact
and hence also $L\circ\pi$ is compact. It follows that $S$ is compact.

We claim that $N(I-S) = \{0\}$. Indeed, if $Sx = x$, then

\[ 0 = Sx - x = \underbrace{Tx - x}_{\in R(I-T)} + \underbrace{L\pi x}_{\in Z} \]

and therefore (7.4) implies $Tx - x = 0$ and $L\pi x = 0$. The first of these identities means
that $x \in N(I-T)$, so $\pi x = x$, and then the second of these identities takes the form
$Lx = 0$. The injectivity of $L$ then implies that $x = 0$. This proves the claim.

By Lemma 7.10, $R(I-S) = X$. To arrive at a contradiction, let $z \in Z \setminus R(L)$ and
choose $x \in X$ such that $x - Sx = z$. Then

\[ \underbrace{x - Tx}_{\in R(I-T)} - \underbrace{L\pi x}_{\in Z} = \underbrace{z}_{\in Z} \]

and by (7.4) this implies $x - Tx = 0$ and $z = -L\pi x = L(-\pi x)$. The second of these identities
contradicts our assumption that $z \in Z \setminus R(L)$.


Step 2: Having established that $d^* \le d$, we now prove the opposite inequality $d \le d^*$
by a duality argument. Setting $d^{**} := \dim N(I-T^{**})$, applying Step 1 to the compact
operator $T^*$ gives $d^{**} \le d^*$. Identifying $X$ with a closed subspace of $X^{**}$, $T^{**}$ is an
extension of $T$ and therefore $d \le d^{**} \le d^*$.
7.3.b Application to Integral Equations


As an application of the foregoing theory we turn to the problem of finding a function
$u \in C[0,1]$ solving inhomogeneous integral equations of the form

\[ \lambda u(s) = f(s) + \int_0^1 k(s,t)u(t)\,dt, \qquad s \in [0,1]. \tag{$H_f$} \]

Here $f \in C[0,1]$ is given, $k : [0,1] \times [0,1] \to \mathbb{K}$ is continuous, and $\lambda$ is a nonzero scalar.
By a solution of this equation we mean a function $u \in C[0,1]$ satisfying ($H_f$)
for all $s \in [0,1]$. In order to study existence of solutions it is useful to also consider the
homogeneous equation corresponding to $f = 0$,

\[ \lambda u(s) = \int_0^1 k(s,t)u(t)\,dt, \qquad s \in [0,1], \tag{$H_0$} \]

as well as the 'dual' homogeneous problem

\[ \lambda v(s) = \int_0^1 k(t,s)v(t)\,dt, \qquad s \in [0,1]. \tag{$H_0^*$} \]

Solutions to these problems are defined in the same way.


Theorem 7.19 (Fredholm alternative for integral equations). Let $k : [0,1] \times [0,1] \to \mathbb{K}$
be continuous and let $\lambda \ne 0$ be fixed. Then:

(1) if the homogeneous problem ($H_0$) has no nonzero solution, then for all $f \in C[0,1]$
the inhomogeneous problem ($H_f$) has a unique solution $u$ in $C[0,1]$;

(2) if the homogeneous problem ($H_0$) has a nonzero solution, then it has at most finitely
many linearly independent solutions, and for a given $f \in C[0,1]$ the
inhomogeneous problem ($H_f$) has a solution if and only if

\[ \int_0^1 f(t)v(t)\,dt = 0 \]

for all $v \in L^1(0,1)$ satisfying the dual homogeneous problem ($H_0^*$).
Proof By the result of Example 7.4 the operator $T : C[0,1] \to C[0,1]$,

\[ Tu(s) := \int_0^1 k(s,t)u(t)\,dt, \qquad s \in [0,1], \]

is compact. Using this operator, the above problem can be abstractly formulated as

\[ (\lambda - T)u = f. \]

If the homogeneous problem $(\lambda - T)u = 0$ has no nonzero solution, then $\lambda$ is not an
eigenvalue. Since we are assuming that $\lambda \ne 0$ it follows that $\lambda \notin \sigma(T)$ by Theorem
7.11. Therefore, $\lambda - T$ is invertible and the inhomogeneous problem $(\lambda - T)u = f$ is
uniquely solved by $u = (\lambda - T)^{-1}f$.

If the homogeneous problem $(\lambda - T)u = 0$ has a nonzero solution, then $\lambda$ is an eigenvalue,
in which case the space of solutions $u$ equals the eigenspace corresponding to $\lambda$,
which is finite-dimensional. In that case, the inhomogeneous problem $(\lambda - T)u = f$ has
a solution $u \in C[0,1]$ if and only if $f \in R(\lambda - T) = {}^{\perp}N(\lambda - T^*)$; here we use
Proposition 5.15 along with the fact that $\lambda - T$ has closed range (Lemma 7.9). Stated differently, problem
$(\lambda - T)u = f$ has a solution $u \in C[0,1]$ if and only if

\[ \langle f,x^*\rangle = 0, \qquad x^* \in N(\lambda - T^*). \]


To make this condition more explicit we recall from Section 4.1.c that the dual of $C[0,1]$
is the space of complex Borel measures on $[0,1]$, the duality between functions $f$ and
measures $\mu$ being given by $\langle f,\mu\rangle = \int_0^1 f\,d\mu$. For such measures we compute

\[ \langle g,T^*\mu\rangle = \langle Tg,\mu\rangle = \int_0^1\!\int_0^1 k(s,t)g(t)\,dt\,d\mu(s)
   = \int_0^1 g(t)\int_0^1 k(s,t)\,d\mu(s)\,dt = \langle g,\nu\rangle, \]

where the $\mathbb{K}$-valued measure $\nu$ is given by

\[ \nu(B) = \langle \mathbf{1}_B,\nu\rangle = \int_B\int_0^1 k(s,t)\,d\mu(s)\,dt \]

for Borel sets $B \subseteq [0,1]$. Now $\mu \in N(\lambda - T^*)$ if and only if $\lambda\mu(B) = \nu(B)$ for all Borel
sets $B \subseteq [0,1]$, that is, if and only if

\[ \lambda\int_B d\mu(t) = \int_B\int_0^1 k(s,t)\,d\mu(s)\,dt \]

for all Borel sets $B \subseteq [0,1]$. In this case $\mu$ is absolutely continuous with respect to
the Lebesgue measure $dt$. Then, by the Radon–Nikodym theorem, $d\mu = v\,dt$, where
$v \in L^1(0,1)$ satisfies

\[ \lambda v(t) = \int_0^1 k(s,t)\,d\mu(s) = \int_0^1 k(s,t)v(s)\,ds \]

for almost all $t \in (0,1)$. Since the right-hand side is a continuous function of $t$, after modifying
$v$ on a null set the equality holds for all $t \in [0,1]$. This means that $v$ solves ($H_0^*$).
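
Both cases of Theorem 7.19 can be observed for the rank one kernel $k(s,t) = st$, which is an example chosen here for illustration and not taken from the text. In this case $Tu(s) = s\int_0^1 tu(t)\,dt$, the only nonzero eigenvalue is $\lambda = 1/3$ with eigenfunction $u(s) = s$, and since the kernel is symmetric the dual homogeneous problem has the solution $v(t) = t$. The Python sketch below assumes a midpoint quadrature rule; the function names are hypothetical.

```python
def midpoint_grid(n):
    """Midpoints and step size of n equal subintervals of [0, 1]."""
    h = 1.0 / n
    return [(j + 0.5) * h for j in range(n)], h


def compatibility(f, n=2000):
    """Midpoint approximation of int_0^1 t f(t) dt, the pairing of f with v(t) = t."""
    ts, h = midpoint_grid(n)
    return sum(t * f(t) for t in ts) * h


def solve_rank_one(lam, f, n=2000):
    """Solve lam*u(s) = f(s) + s*int_0^1 t u(t) dt (discretised) for lam not 0 and not 1/3."""
    ts, h = midpoint_grid(n)
    q = sum(t * t for t in ts) * h      # approximates int_0^1 t^2 dt = 1/3
    b = compatibility(f, n)             # approximates int_0^1 t f(t) dt
    if lam == 0 or abs(lam - q) < 1e-9:
        raise ValueError("lam must be nonzero and must not be the eigenvalue 1/3")
    c = b / (lam - q)                   # c approximates int_0^1 t u(t) dt
    return lambda s: (f(s) + c * s) / lam


# Case (1): lam = 1 is not an eigenvalue; for f = 1 the exact solution is u(s) = 1 + 3s/4.
u = solve_rank_one(1.0, lambda s: 1.0)
print(u(0.5))

# Case (2): at lam = 1/3 solvability requires int_0^1 t f(t) dt = 0.
print(compatibility(lambda s: 1.0 - 1.5 * s))   # close to 0: compatible right-hand side
print(compatibility(lambda s: 1.0))             # close to 1/2: not compatible
```

The reduction to the single unknown $c = \int_0^1 tu(t)\,dt$ works only because the kernel has rank one; for a general continuous kernel one would solve a full linear system on the grid instead.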


7.3.c Fredholm Operators 


Let X and Y be Banach spaces. The following definition is suggested by the Fredholm 
alternative (Theorem 7.17): 


Definition 7.20 (Fredholm operators). A bounded operator $T \in \mathscr{L}(X,Y)$ is called a
Fredholm operator if it has the following properties:

(i) $\dim N(T) < \infty$;
(ii) $\operatorname{codim} R(T) < \infty$.

The index of such an operator is defined as

\[ \operatorname{ind}(T) := \dim N(T) - \operatorname{codim} R(T). \]

Example 7.21. Here are some examples of Fredholm operators:


(i) If $T$ is a compact operator, then $I-T$ is Fredholm with index $\operatorname{ind}(I-T) = 0$. This is
a restatement of the Fredholm alternative.

(ii) The left and right shift on $\ell^p$, $1 \le p \le \infty$, are Fredholm with indices $1$ and $-1$,
respectively.

(iii) For every zero-free $\phi \in C(\mathbb{T})$ the Toeplitz operator $T_\phi$ on the Hardy space $H^2(\mathbb{D})$
is Fredholm with index $\operatorname{ind}(T_\phi) = -w(\phi)$, where $w(\phi)$ is the winding number of
$\phi$. This is the content of the Noether–Gohberg–Krein theorem in Section 7.3.d, where
the relevant definitions can be found.
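
The indices in (ii) can be read off directly from the null spaces and ranges. The computation below is a worked illustration added here; the symbols $S_\ell$ and $S_r$ for the left and right shift and the indexing of sequences from $0$ are notational assumptions made for this sketch.

```latex
% Illustration: null space, range and index of the shifts on ell^p.
S_\ell(x_0,x_1,x_2,\dots) := (x_1,x_2,\dots), \qquad S_r(x_0,x_1,\dots) := (0,x_0,x_1,\dots). \\
N(S_\ell) = \operatorname{span}\{e_0\}, \quad R(S_\ell) = \ell^p
\quad\Longrightarrow\quad \operatorname{ind}(S_\ell) = 1 - 0 = 1. \\
N(S_r) = \{0\}, \quad R(S_r) = \{x \in \ell^p : x_0 = 0\}
\quad\Longrightarrow\quad \operatorname{ind}(S_r) = 0 - 1 = -1.
```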


We begin our analysis of Fredholm operators with the observation that such operators 
have closed range. As a result, codim R(T) equals the dimension of the quotient Banach 
space Y/R(T). 


Proposition 7.22. If the range of a bounded operator $T \in \mathscr{L}(X,Y)$ has finite codimension,
then it is closed.


Proof Let $Y_0$ be a finite-dimensional subspace of $Y$ such that $R(T) \cap Y_0 = \{0\}$ and
$R(T) + Y_0 = Y$. Then $Y_0$ is closed and the bounded operator $S : X \times Y_0 \to Y$ defined by
$S(x,y_0) := Tx + y_0$ is surjective. By the open mapping theorem, $S$ is open. In particular,
$S(X \times (Y_0 \setminus \{0\}))$ is open. Clearly this set is the complement of $S(X \times \{0\}) = R(T)$ and
therefore $R(T)$ is closed.


Theorem 7.23 (Atkinson). For a bounded operator $T \in \mathscr{L}(X,Y)$ the following assertions
are equivalent:

(1) $T$ is Fredholm;
(2) there exist a bounded operator $S \in \mathscr{L}(Y,X)$ and compact operators $K \in \mathscr{L}(X)$ and
$L \in \mathscr{L}(Y)$ such that

\[ ST = I-K, \qquad TS = I-L. \]

If these equivalent conditions hold, the operator $S$ is Fredholm with index

\[ \operatorname{ind}(S) = -\operatorname{ind}(T). \]

Moreover, $S$ can be chosen in such a way that $K = I-ST$ and $L = I-TS$ are finite rank
projections satisfying $\dim(R(K)) = \dim(N(T))$ and $\dim(R(L)) = \operatorname{codim}(R(T))$.

Proof (1)$\Rightarrow$(2): By Propositions 4.16 and 7.22 there exist closed subspaces $X_0 \subseteq X$
and $Y_0 \subseteq Y$ such that $\operatorname{codim}(X_0) < \infty$, $\dim(Y_0) < \infty$, and

\[ X = N(T) \oplus X_0, \qquad Y = R(T) \oplus Y_0. \]

Let $P$ and $Q$ denote the corresponding projections in $X$ and $Y$ onto $X_0$ and $R(T)$,
respectively. The restriction $T_0 := T|_{X_0}$ is a bijection from $X_0$ onto $R(T)$, injectivity and
surjectivity both being clear. Since $R(T)$ is closed, the open mapping theorem implies
that the inverse mapping $S_0 := T_0^{-1}$ is bounded as an operator from $R(T)$ onto $X_0$. Define
$S \in \mathscr{L}(Y,X)$ by $S := S_0 \circ Q$. Then for all $x \in X$ and $y \in Y$ we have

\[ STx = S_0QTx = S_0Tx = S_0TPx = Px = x - Kx \quad\text{with } K = I-P \]

and

\[ TSy = TS_0Qy = Qy = y - Ly \quad\text{with } L = I-Q. \]

Since $I-P$ and $I-Q$ are the projections onto the finite-dimensional subspaces $N(T)$
and $Y_0$, these projections are of finite rank and hence compact. It also follows that
$\dim(R(K)) = \dim(N(T))$ and $\dim(R(L)) = \operatorname{codim}(R(T))$.

(2)$\Rightarrow$(1): We have $N(T) \subseteq N(ST)$ and hence

\[ \dim(N(T)) \le \dim(N(ST)) = \dim(N(I-K)) < \infty. \]

Likewise $R(T) \supseteq R(TS)$ and hence

\[ \operatorname{codim}(R(T)) \le \operatorname{codim}(R(TS)) = \operatorname{codim}(R(I-L)) < \infty, \]

the finiteness of the codimension being a consequence of Theorem 7.17. This completes
the proof of the equivalence (1)$\Leftrightarrow$(2).

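It remains to prove the identity $\operatorname{ind}(S) = -\operatorname{ind}(T)$. Using the notation introduced
before we have $N(S) = N(S_0Q) = N(Q) = Y_0$, so

\[ \dim(N(S)) = \dim(Y_0) = \operatorname{codim}(R(T)) \]

and likewise $R(S) = R(S_0Q) = X_0$, so

\[ \operatorname{codim}(R(S)) = \operatorname{codim}(X_0) = \dim(N(T)). \]

As a result,

\[ \operatorname{ind}(S) = \dim(N(S)) - \operatorname{codim}(R(S)) = \operatorname{codim}(R(T)) - \dim(N(T)) = -\operatorname{ind}(T). \]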
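
For the right shift on $\ell^2$ the operator $S$ of Atkinson's theorem can be written down explicitly. The following worked case is an illustration added here; the notation $S_r$, $S_\ell$ for the right and left shift (with sequences indexed from $0$) and $P_0$ for the coordinate projection is introduced only for this sketch.

```latex
% Illustration: Atkinson's theorem for T = S_r with S = S_ell.
S_r(x_0,x_1,\dots) := (0,x_0,x_1,\dots), \qquad S_\ell(x_0,x_1,\dots) := (x_1,x_2,\dots), \qquad P_0x := x_0e_0. \\
T := S_r, \quad S := S_\ell: \qquad ST = S_\ell S_r = I, \qquad TS = S_rS_\ell = I - P_0. \\
K = 0, \quad L = P_0: \qquad \dim R(K) = 0 = \dim N(S_r), \qquad \dim R(L) = 1 = \operatorname{codim} R(S_r). \\
\operatorname{ind}(S_\ell) = 1 = -\operatorname{ind}(S_r).
```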


A concise way of stating the equivalence (1)$\Leftrightarrow$(2) is by introducing the Calkin algebra

\[ \mathscr{L}(X,Y)/\mathscr{K}(X,Y). \]

Theorem 7.23 states that an operator $T \in \mathscr{L}(X,Y)$ is Fredholm if and only if its equivalence
class in $\mathscr{L}(X,Y)/\mathscr{K}(X,Y)$ is invertible in the sense that there exists an operator
$S \in \mathscr{L}(Y,X)$ such that $ST = I \bmod \mathscr{K}(X)$ and $TS = I \bmod \mathscr{K}(Y)$.


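Proposition 7.24. If $T_1 \in \mathscr{L}(X,Y)$ and $T_2 \in \mathscr{L}(Y,Z)$ are Fredholm, where $Z$ is another
Banach space, then $T_2T_1 \in \mathscr{L}(X,Z)$ is Fredholm and

\[ \operatorname{ind}(T_2T_1) = \operatorname{ind}(T_1) + \operatorname{ind}(T_2). \]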
Proof Let $S_1 \in \mathscr{L}(Y,X)$ and $S_2 \in \mathscr{L}(Z,Y)$ be such that

\[ S_1T_1 = I-K_1, \quad T_1S_1 = I-L_1, \quad S_2T_2 = I-K_2, \quad T_2S_2 = I-L_2, \]

with $K_1,K_2,L_1,L_2$ compact. Then

\[ (S_1S_2)(T_2T_1) = S_1(I-K_2)T_1 = I - K_1 - S_1K_2T_1 =: I - K_3, \]

where $K_3 = K_1 + S_1K_2T_1$ is compact. Likewise

\[ (T_2T_1)(S_1S_2) = T_2(I-L_1)S_2 = I - L_2 - T_2L_1S_2 =: I - L_3, \]

where $L_3 = L_2 + T_2L_1S_2$ is compact. Hence Atkinson's theorem implies that $T_2T_1$ is
Fredholm. To compute its index, let $X_1, Y_1, Y_2, Z_2$ be closed subspaces such
that

\[ X = N(T_1) \oplus X_1, \qquad Y = R(T_1) \oplus Y_1 = N(T_2) \oplus Y_2, \qquad Z = R(T_2) \oplus Z_2. \]

We have $x \in N(T_2T_1)$ if and only if

\[ x \in N(T_1) \oplus \{x_1 \in X_1 : T_1x_1 \in N(T_2)\} = N(T_1) \oplus (T_1|_{X_1})^{-1}\bigl(R(T_1) \cap N(T_2)\bigr). \]

Since $T_1$ acts as an isomorphism from $X_1$ onto $R(T_1)$, we have

\[ \dim(N(T_2T_1)) = \dim(N(T_1)) + \dim(R(T_1) \cap N(T_2)). \]

Furthermore we have

\[ \dim(N(T_2)) = \dim(R(T_1) \cap N(T_2)) + \dim(Y_1 \cap N(T_2)). \]

Combined with the previous identity this gives

\[ \dim(N(T_2T_1)) = \dim(N(T_1)) + \dim(N(T_2)) - \dim(Y_1 \cap N(T_2)). \tag{7.5} \]

Next, we have

\[ Z = R(T_2) \oplus Z_2 = T_2\bigl(R(T_1) \oplus Y_1\bigr) \oplus Z_2 \]

and therefore

\[ \operatorname{codim}(R(T_2T_1)) = \operatorname{codim}(R(T_2)) + \operatorname{codim}(R(T_1)) - \dim(Y_1 \cap N(T_2)). \tag{7.6} \]

It follows from (7.5) and (7.6) that

\[ \begin{aligned}
\operatorname{ind}(T_2T_1) &= \dim(N(T_2T_1)) - \operatorname{codim}(R(T_2T_1)) \\
&= \dim(N(T_1)) + \dim(N(T_2)) - \operatorname{codim}(R(T_2)) - \operatorname{codim}(R(T_1)) \\
&= \operatorname{ind}(T_1) + \operatorname{ind}(T_2).
\end{aligned} \]

The next three propositions show that Fredholmness is preserved under various oper- 
ations. 


Proposition 7.25. If $T \in \mathscr{L}(X,Y)$ is Fredholm and $K \in \mathscr{K}(X,Y)$ is compact, then
$T+K$ is Fredholm and

\[ \operatorname{ind}(T+K) = \operatorname{ind}(T). \]

Proof If $S \in \mathscr{L}(Y,X)$ and compact operators $L_1 \in \mathscr{L}(X)$ and $L_2 \in \mathscr{L}(Y)$ are such
that $ST = I-L_1$ and $TS = I-L_2$, then $S(T+K) = I - L_1 + SK = I - M_1$ with $M_1 =
L_1 - SK$ compact, and $(T+K)S = I - L_2 + KS = I - M_2$ with $M_2 = L_2 - KS$ compact.
Hence $T+K$ is Fredholm by Atkinson's theorem. Moreover, by Proposition 7.24 and
the identity for indices in Atkinson's theorem,

\[ 0 = \operatorname{ind}(I-M_1) = \operatorname{ind}(S(T+K)) = \operatorname{ind}(S) + \operatorname{ind}(T+K), \]

so $\operatorname{ind}(T+K) = -\operatorname{ind}(S) = \operatorname{ind}(T)$.


The set of Fredholm operators is open in $\mathscr{L}(X,Y)$:

Proposition 7.26 (Dieudonné). For any Fredholm operator $T \in \mathscr{L}(X,Y)$ there exists
a number $\delta > 0$ such that for all $U \in \mathscr{L}(X,Y)$ with $\|U\| < \delta$ the operator $T+U$ is
Fredholm and

\[ \operatorname{ind}(T+U) = \operatorname{ind}(T). \]

Proof The proof is a variation on the proof of the openness of the set of invertible
bounded operators. Let $S \in \mathscr{L}(Y,X)$ be such that $ST = I-K$ and $TS = I-L$ with
$K \in \mathscr{L}(X)$ and $L \in \mathscr{L}(Y)$ compact. Then

\[ S(T+U) = I - K + SU, \qquad (T+U)S = I - L + US. \]

If $\|U\| < \delta := \|S\|^{-1}$, then $I+SU$ and $I+US$ are boundedly invertible and

\[ (I+SU)^{-1}S(T+U) = I - (I+SU)^{-1}K, \]
\[ (T+U)S(I+US)^{-1} = I - L(I+US)^{-1}, \]

where $M := (I+SU)^{-1}K$ and $N := L(I+US)^{-1}$ are compact. We deduce from this that
$T+U$ is Fredholm: $N(T+U) \subseteq N((I+SU)^{-1}S(T+U)) = N(I-M)$ and the latter is
finite-dimensional, and likewise $R(T+U) \supseteq R((T+U)S(I+US)^{-1}) = R(I-N)$ and the
latter has finite codimension.

Next, by Proposition 7.24 and Theorem 7.17,

\[ \operatorname{ind}((I+SU)^{-1}) + \operatorname{ind}(S) + \operatorname{ind}(T+U) = \operatorname{ind}(I-M) = 0 \]

and $\operatorname{ind}((I+SU)^{-1}) = 0$ by invertibility. By the identity for indices in Atkinson's theorem
it follows that

\[ \operatorname{ind}(T+U) = -\operatorname{ind}(S) = \operatorname{ind}(T). \]


Proposition 7.27. If $T \in \mathscr{L}(X,Y)$ is Fredholm, then $T^* \in \mathscr{L}(Y^*,X^*)$ is Fredholm and

\[ \operatorname{ind}(T^*) = -\operatorname{ind}(T). \]

Proof If $S \in \mathscr{L}(Y,X)$ is bounded and $F_1 \in \mathscr{L}(X)$ and $F_2 \in \mathscr{L}(Y)$ are finite rank
projections such that $ST = I-F_1$ and $TS = I-F_2$, then $S^*T^* = I-F_2^*$ and $T^*S^* = I-F_1^*$
with $F_1^*$ and $F_2^*$ finite rank projections. Hence $T^*$ is Fredholm by Atkinson's theorem.
We claim that

\[ \dim(N(T^*)) = \operatorname{codim}(R(T)) \tag{7.7} \]

and

\[ \dim(N(T)) = \operatorname{codim}(R(T^*)). \tag{7.8} \]

Together, these identities imply that $\operatorname{ind}(T^*) = -\operatorname{ind}(T)$. We give a detailed proof of
(7.8) and indicate the changes that need to be made to prove (7.7).
Since $T^*$ is Fredholm we have a direct sum decomposition

\[ X^* = R(T^*) \oplus W \tag{7.9} \]

with $W \subseteq X^*$ a finite-dimensional subspace.
If $x_1,\dots,x_k$ is a basis for $N(T)$, by the Hahn–Banach extension theorem we obtain
$x_1^*,\dots,x_k^* \in X^*$ such that

\[ \langle x_i,x_j^*\rangle = \delta_{ij}, \qquad 1 \le i,j \le k. \tag{7.10} \]

Let $Z$ denote the span of $x_1^*,\dots,x_k^*$ in $X^*$. We claim that

\[ R(T^*) \cap Z = \{0\}. \]

Indeed, if $x^* \in Z$, say $x^* = \sum_{j=1}^k c_jx_j^*$, then

\[ \langle x_i,x^*\rangle = c_i, \qquad 1 \le i \le k. \]

If we also have $x^* \in R(T^*)$, say $x^* = T^*y^*$, then from $x_i \in N(T)$ we obtain

\[ c_i = \langle x_i,x^*\rangle = \langle Tx_i,y^*\rangle = 0, \qquad 1 \le i \le k. \]

This implies $x^* = 0$ and proves the claim.
Now, for any fixed $x^* \in X^*$, set

\[ \xi^* := x^* - \sum_{j=1}^k \langle x_j,x^*\rangle x_j^*. \]

Then, for $i = 1,\dots,k$,

\[ \langle x_i,\xi^*\rangle = \langle x_i,x^*\rangle - \sum_{j=1}^k \langle x_j,x^*\rangle\langle x_i,x_j^*\rangle = \langle x_i,x^*\rangle - \langle x_i,x^*\rangle = 0. \]

This means that $\xi^* \in N(T)^{\perp}$. By Theorem 5.16, this implies that $\xi^* \in R(T^*)$. Since
$x^* - \xi^* \in Z$ it follows that $R(T^*) + Z = X^*$. Together with $Z \cap R(T^*) = \{0\}$ it follows
that we have a direct sum decomposition

\[ X^* = R(T^*) \oplus Z. \tag{7.11} \]

From (7.9) and (7.11) it follows that $\dim(W) = \dim(Z)$, and $\dim(Z) = \dim(N(T))$
and $\dim(W) = \operatorname{codim}(R(T^*))$. This completes the proof of (7.8).

The proof of (7.7) proceeds along the same lines, interchanging the roles of $T$ and
$T^*$. We now consider a basis $x_1^*,\dots,x_k^*$ for $N(T^*)$ and use the Hahn–Banach theorem to
obtain $x_1^{**},\dots,x_k^{**} \in X^{**}$ such that

\[ \langle x_i^*,x_j^{**}\rangle = \delta_{ij}, \qquad 1 \le i,j \le k. \]

At this point we invoke Theorem 4.47 to obtain $x_1,\dots,x_k \in X$ such that

\[ \langle x_j,x_i^*\rangle = \delta_{ij}, \qquad 1 \le i,j \le k. \]

With this analogue of (7.10) at hand the proof can be completed as before.


7.3.d The Noether–Gohberg–Krein Theorem


Let $\mathbb{D}$ and $\mathbb{T}$ denote the open unit disc and unit circle in the complex plane,
respectively. We think of $\mathbb{T}$ as parametrised by $\theta \in [-\pi,\pi]$ and equipped with the normalised
Lebesgue measure $d\theta/2\pi$. The Hardy space $H^2(\mathbb{D})$ is the Hilbert space of all holomorphic
functions on $\mathbb{D}$ of the form $\sum_{n\in\mathbb{N}} c_nz^n$ with

\[ \sum_{n\in\mathbb{N}} |c_n|^2 < \infty. \]

Since every square summable sequence $(c_n)_{n\in\mathbb{N}}$ defines a convergent power series
$\sum_{n\in\mathbb{N}} c_nz^n$ holomorphic on $\mathbb{D}$, the correspondence between a power series and its
coefficient sequence sets up an isometric isomorphism between $H^2(\mathbb{D})$ and $\ell^2(\mathbb{N})$. With
respect to the norm

\[ \|f\|_{H^2(\mathbb{D})} = \Bigl(\sum_{n\in\mathbb{N}} |c_n|^2\Bigr)^{1/2}, \]

$H^2(\mathbb{D})$ is a Hilbert space. For $n \in \mathbb{Z}$ consider the functions $e_n \in L^2(\mathbb{T})$ defined by

\[ e_n(\theta) := \exp(in\theta). \]

Since $(e_n)_{n\in\mathbb{N}}$ is an orthonormal sequence in $L^2(\mathbb{T})$, every square summable sequence
$(c_n)_{n\in\mathbb{N}}$ defines a convergent sum $\sum_{n\in\mathbb{N}} c_ne_n$ in $L^2(\mathbb{T})$. Denoting this sum by $f$, its
Fourier coefficients are given by

\[ \hat{f}(n) = \begin{cases} c_n, & n \in \mathbb{N}, \\ 0, & n \in \mathbb{Z}\setminus\mathbb{N}. \end{cases} \]

Conversely, if all negative Fourier coefficients of a function $f \in L^2(\mathbb{T})$ vanish, then $f =
\sum_{n\in\mathbb{N}} \hat{f}(n)e_n$ as a convergent sum in $L^2(\mathbb{T})$. Since the Fourier coefficients of functions
in $L^2(\mathbb{T})$ are square summable, we obtain an isometric isomorphism between $H^2(\mathbb{D})$
and the closed subspace of $L^2(\mathbb{T})$ consisting of all functions whose negative Fourier
coefficients vanish. In what follows we identify $H^2(\mathbb{D})$ with this closed subspace of
$L^2(\mathbb{T})$. As such, $H^2(\mathbb{D})$ is the range of the Riesz projection

\[ P : \sum_{n\in\mathbb{Z}} \hat{f}(n)e_n \mapsto \sum_{n\in\mathbb{N}} \hat{f}(n)e_n \]

in $L^2(\mathbb{T})$ which discards the terms in the Fourier series with negative indices.
Given a function $\phi \in L^\infty(\mathbb{T})$ we define the bounded operator $M_\phi$ on $L^2(\mathbb{T})$ by pointwise
multiplication,

\[ M_\phi f := \phi f, \qquad f \in L^2(\mathbb{T}). \]

When $M_\phi$ is applied to a function $f \in H^2(\mathbb{D})$, the resulting function $\phi f$ generally does
not belong to $H^2(\mathbb{D})$, but the Riesz projection will take us back to $H^2(\mathbb{D})$. This motivates
the following definition.

Definition 7.28 (Toeplitz operators). Given a function $\phi \in L^\infty(\mathbb{T})$, the Toeplitz operator
with symbol $\phi$ is the operator $T_\phi$ on $H^2(\mathbb{D})$ given by

\[ T_\phi f := P(\phi f), \qquad f \in H^2(\mathbb{D}), \]

where $P$ is the Riesz projection.
Every Toeplitz operator $T_\phi$ is bounded of norm

\[ \|T_\phi\| \le \|P\|\,\|M_\phi\| \le \|\phi\|_\infty. \tag{7.12} \]

Its Hilbert space adjoint is given by $T_\phi^* = T_{\bar\phi}$; this follows from

\[ (T_\phi f|g) = (\phi f|g)_{L^2(\mathbb{T})} = (f|\bar\phi g)_{L^2(\mathbb{T})} = (f|T_{\bar\phi}g), \qquad f,g \in H^2(\mathbb{D}). \]

The following theorem shows that a Toeplitz operator with continuous and zero-free
symbol $\phi \in C(\mathbb{T})$ is Fredholm, and its index equals the negative of the winding number
of the closed contour in $\mathbb{C}\setminus\{0\}$ parametrised by $\phi$. As we have seen in Section 6.2,
for piecewise $C^1$ functions $\phi$, the winding number is given analytically by the contour
integral

\[ w(\phi) = \frac{1}{2\pi i}\int_\phi \frac{dz}{z} = \frac{1}{2\pi i}\int_{-\pi}^{\pi} \frac{\phi'(\theta)}{\phi(\theta)}\,d\theta. \]


For functions $\phi$ that are merely continuous, the winding number can be defined as follows.
It is an elementary theorem in Algebraic Topology that there exists a unique integer
$n \in \mathbb{Z}$ such that $\phi$ is homotopic to the curve $\theta \mapsto e_n(\theta)$. By definition, this means
that there exists a homotopy from $\phi$ to $e_n$, that is, a continuous function

\[ h : [0,1] \times [-\pi,\pi] \to \mathbb{C}\setminus\{0\} \]

such that $h(t,-\pi) = h(t,\pi)$ for all $t \in [0,1]$ and for all $\theta \in [-\pi,\pi]$ we have

\[ h(0,\theta) = \phi(\theta), \qquad h(1,\theta) = e_n(\theta). \]

Setting $h_t(\theta) := h(t,\theta)$, we think of the closed curves $h_t : [-\pi,\pi] \to \mathbb{C}\setminus\{0\}$ as continuously
deforming $\phi = h_0$ to $e_n = h_1$. The winding number $w(\phi)$ of $\phi$ is defined to be this
integer:

\[ w(\phi) := n. \]

In particular, the winding number of $e_n$ equals $n$. It is an easy consequence of Cauchy's
theorem that this definition agrees with the analytic definition given earlier if $\phi$ is
piecewise $C^1$.
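
For a concrete symbol the winding number can be approximated numerically by summing the increments of the argument along the curve. The Python sketch below is an illustration added here, not a method from the text; the function name `winding_number` and the sample count are hypothetical, and the sketch assumes that the symbol is zero-free and sampled finely enough that consecutive values differ in argument by less than $\pi$.

```python
import cmath
import math


def winding_number(phi, samples=4000):
    """Winding number about 0 of the closed curve theta -> phi(theta), theta in [-pi, pi]."""
    total = 0.0
    previous = phi(-math.pi)
    for k in range(1, samples + 1):
        theta = -math.pi + 2.0 * math.pi * k / samples
        current = phi(theta)
        # Argument increment between consecutive samples, taken in (-pi, pi].
        total += cmath.phase(current / previous)
        previous = current
    return round(total / (2.0 * math.pi))


# Usage: symbols e_2, 2 + e_1 and 1 + 3 e_{-1}.
print(winding_number(lambda t: cmath.exp(2j * t)))
print(winding_number(lambda t: 2 + cmath.exp(1j * t)))
print(winding_number(lambda t: 1 + 3 * cmath.exp(-1j * t)))
```

By Theorem 7.29 the Toeplitz operators with these three symbols would then have indices equal to the negatives of the computed winding numbers.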
Theorem 7.29 (Noether–Gohberg–Krein). If the function $\phi \in C(\mathbb{T})$ is zero-free, then
the Toeplitz operator $T_\phi$ is Fredholm on $H^2(\mathbb{D})$ and

\[ \operatorname{ind}(T_\phi) = -w(\phi). \]

This theorem is remarkable, as it computes an analytic quantity (the index) in terms
of a topological one (the winding number). The main ingredient in the proof is the
following lemma, which implies that the mapping $\phi \mapsto T_\phi$ from $C(\mathbb{T})$ to $\mathscr{L}(H^2(\mathbb{D}))$ is
multiplicative up to a compact operator.

Lemma 7.30. For all $\phi,\psi \in C(\mathbb{T})$ the operator $T_\phi T_\psi - T_{\phi\psi}$ is compact on $H^2(\mathbb{D})$.
Proof By the estimate (7.12), the Weierstrass approximation theorem (Theorem 2.3),
and the fact that uniform limits of compact operators are compact (Proposition 7.5), it
suffices to prove the lemma for trigonometric polynomials $\phi$ and $\psi$.

Let $\phi = \sum_{m=-M}^{M} c_me_m$ and $\psi = \sum_{n=-N}^{N} d_ne_n$ be trigonometric polynomials. For $j \ge N$
we have

\[ (T_\phi T_\psi - T_{\phi\psi})e_j
   = \sum_{m=-M}^{M}\sum_{n=-N}^{N} c_md_n\,P(e_mP(e_ne_j)) - \sum_{m=-M}^{M}\sum_{n=-N}^{N} c_md_n\,P(e_me_ne_j) = 0 \]

since $n + j \ge 0$ and hence $e_mP(e_ne_j) = e_mP(e_{n+j}) = e_me_{n+j} = e_me_ne_j$ in each summand.
By linearity and density, this shows that $(T_\phi T_\psi - T_{\phi\psi})f = 0$ for all $f$ in the closed linear
span of $\{e_j : j \ge N\}$. This implies that $T_\phi T_\psi - T_{\phi\psi}$ is a finite rank operator (of rank at
most $N$) and hence compact.
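
The vanishing of $(T_\phi T_\psi - T_{\phi\psi})e_j$ for $j \ge N$ can be checked mechanically on coefficient sequences. The Python sketch below is an illustration added here; it represents a trigonometric polynomial as a dictionary mapping an index $n$ to the coefficient of $e_n$, and all function names are hypothetical.

```python
def multiply(a, b):
    """Product of two trigonometric polynomials given as {index: coefficient}."""
    out = {}
    for m, cm in a.items():
        for n, dn in b.items():
            out[m + n] = out.get(m + n, 0) + cm * dn
    return {k: v for k, v in out.items() if v != 0}


def riesz(a):
    """Riesz projection: discard the terms with negative index."""
    return {k: v for k, v in a.items() if k >= 0}


def toeplitz(symbol, f):
    """Apply the Toeplitz operator with the given symbol to f."""
    return riesz(multiply(symbol, f))


def defect(phi, psi, j):
    """Nonzero coefficients of (T_phi T_psi - T_{phi psi}) e_j."""
    e_j = {j: 1}
    lhs = toeplitz(phi, toeplitz(psi, e_j))
    rhs = toeplitz(multiply(phi, psi), e_j)
    keys = set(lhs) | set(rhs)
    return {k: lhs.get(k, 0) - rhs.get(k, 0) for k in keys if lhs.get(k, 0) != rhs.get(k, 0)}


# Usage with phi = e_{-2} + 3 e_0 - e_1 and psi = 2 e_{-1} + 5 e_2, so that N = 2.
phi = {-2: 1, 0: 3, 1: -1}
psi = {-1: 2, 2: 5}
print(defect(phi, psi, 0))                          # a nonzero defect for j < N
print([defect(phi, psi, j) for j in range(2, 6)])   # empty dictionaries for j >= N
```

Integer coefficients are used so that the comparison with zero is exact; with floating point coefficients a tolerance would be needed.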


For functions $, y € C?(T) a more precise result will be proved in Section 14.5.d. 

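Proof of Theorem 7.29 It follows from the lemma that the mapping

\[ J : C(\mathbb{T}) \to \mathscr{L}(H^2(\mathbb{D}))/\mathscr{K}(H^2(\mathbb{D})) \]

given by

\[ J(\phi) := T_\phi + \mathscr{K}(H^2(\mathbb{D})) \]

is multiplicative:

\[ J(\phi\psi) = J(\phi)J(\psi), \qquad \phi,\psi \in C(\mathbb{T}). \]

If $\phi$ is zero-free, then $1/\phi$ defines an element of $C(\mathbb{T})$ and

\[ J(1/\phi)J(\phi) = J(\phi)J(1/\phi) = J(\mathbf{1}) = I + \mathscr{K}(H^2(\mathbb{D})). \]

Stated differently, there exist compact operators $K,L \in \mathscr{K}(H^2(\mathbb{D}))$ such that with $S :=
T_{1/\phi}$ we have

\[ ST_\phi = I-K, \qquad T_\phi S = I-L. \]

By Atkinson's theorem this implies that $T_\phi$ is Fredholm.
It remains to compute the index. To this end let $w(\phi) = n$ be the winding number of
$\phi$ and let $h : [0,1] \times [-\pi,\pi] \to \mathbb{C}\setminus\{0\}$ be a homotopy from $\phi$ to $e_n$. By the continuity
of the mapping $\phi \mapsto T_\phi$ and Dieudonné's theorem (Proposition 7.26), the mapping

\[ t \mapsto \operatorname{ind}(T_{h_t}), \qquad t \in [0,1], \]

is locally constant, and hence constant, where $h_t(s) := h(t,s)$. Since $h_0 = \phi$ and $h_1 = e_n$,
in particular we obtain

\[ \operatorname{ind}(T_\phi) = \operatorname{ind}(T_{e_n}). \]

Moreover, from

\[ T_{e_n}\sum_{j\in\mathbb{N}} c_je_j = P\Bigl(\sum_{j\in\mathbb{N}} c_je_{j+n}\Bigr) = \sum_{j \ge \max\{-n,0\}} c_je_{j+n} \]

we see that

\[ \dim N(T_{e_n}) = \begin{cases} 0, & n \ge 0, \\ -n, & n < 0, \end{cases} \qquad
   \operatorname{codim} R(T_{e_n}) = \begin{cases} n, & n \ge 0, \\ 0, & n < 0, \end{cases} \]

so that $\operatorname{ind}(T_{e_n}) = -n$.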


The following result clarifies why the symbol was assumed to be zero-free. 


Theorem 7.31 (Hartman–Wintner). If $\phi \in C(\mathbb{T})$ is such that the Toeplitz operator $T_\phi$ is
Fredholm on $H^2(\mathbb{D})$, then $\phi$ is zero-free.


Proof Since $N(T_\phi)$ is finite-dimensional and hence complemented, we have a direct
sum decomposition $H^2(\mathbb{D}) = X_0 \oplus N(T_\phi)$. Denote by $\pi$ the projection onto $N(T_\phi)$ along
this decomposition. The operator $T_\phi$ restricts to an injective bounded operator from $X_0$
onto $R(T_\phi)$, and the latter is a closed subspace of $H^2(\mathbb{D})$ by Proposition 7.22. Hence by
the open mapping theorem there exists a constant $C > 0$ such that

\[ \|T_\phi f_0\| \ge C\|f_0\|, \qquad f_0 \in X_0. \]

For $f \in H^2(\mathbb{D})$ write $f = f_0 + g$ along the above decomposition. Then, since $|||f||| :=
C\|f_0\| + \|g\|$ is an equivalent norm on $H^2(\mathbb{D})$,

\[ \|T_\phi f\| + \|\pi f\| = \|T_\phi f_0\| + \|g\| \ge C\|f_0\| + \|g\| \ge C'\|f\|, \]

where $C' > 0$ is a constant independent of $f$. For all $g \in L^2(\mathbb{T})$ we thus obtain

\[ \|T_\phi Pg\| + \|\pi Pg\| \ge C'\|Pg\| \ge C'\bigl(\|g\| - \|(I-P)g\|\bigr), \]

where $P$ is the Riesz projection. Let $U \in \mathscr{L}(L^2(\mathbb{T}))$ be the bounded operator given by
$Ug(\theta) := \exp(i\theta)g(\theta)$. Applying the preceding estimate to $U^ng$ in place of $g$ and using
that $U^n$ and $U^{-n}$ are isometric, for all $g \in L^2(\mathbb{T})$ we obtain

\[ \|U^{-n}T_\phi PU^ng\| + \|U^{-n}\pi PU^ng\| + C'\|U^{-n}(I-P)U^ng\| \ge C'\|g\|. \tag{7.13} \]

For every trigonometric polynomial $g$ we have $U^{-n}PU^ng \to g$ in $L^2(\mathbb{T})$ as $n \to \infty$. Since these
polynomials are dense in $L^2(\mathbb{T})$ and the operators $U^{\pm n}$ all have norm one, it follows that

\[ U^{-n}PU^ng \to g, \qquad g \in L^2(\mathbb{T}). \]

This implies that $U^{-n}(I-P)U^ng \to 0$ for all $g \in L^2(\mathbb{T})$ and, using that $U$ commutes
with $M_\phi$,

\[ U^{-n}T_\phi PU^ng = U^{-n}PM_\phi PU^ng = (U^{-n}PU^n)M_\phi(U^{-n}PU^n)g \to M_\phi g \]

for all $g \in L^2(\mathbb{T})$. Also, $(U^ng|h) \to 0$ for all $g,h \in L^2(\mathbb{T})$ and therefore, since $\pi$ is of
finite rank,

\[ \pi PU^ng \to 0, \qquad g \in L^2(\mathbb{T}). \]

Passing to the limit in (7.13), we obtain

\[ \|M_\phi g\| \ge C'\|g\|, \qquad g \in L^2(\mathbb{T}). \]

Since $T_\phi^* = T_{\bar\phi}$ is Fredholm, we also obtain, possibly with a different constant $C' > 0$,

\[ \|M_\phi^*g\| = \|M_{\bar\phi}g\| \ge C'\|g\|, \qquad g \in L^2(\mathbb{T}). \]

It follows that $M_\phi$ is invertible (indeed, the inequality for $M_\phi$ gives injectivity and closed
range, and the inequality for $M_\phi^*$ gives that $M_\phi$ has dense range). This is only possible if
$\phi$ is zero-free (the inverse is then given by $M_{1/\phi}$).


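Corollary 7.32. For all $\phi \in C(\mathbb{T})$, the norm of the Toeplitz operator $T_\phi$ is given by

\[ \|T_\phi\| = \|\phi\|_\infty. \]

Proof Denote by $M_\phi$ the pointwise multiplication operator $f \mapsto \phi f$ on $L^2(\mathbb{T})$. We have
$\sigma(M_\phi) = \{\phi(\theta) : \theta \in \mathbb{T}\}$. If $\lambda - T_\phi = T_{\lambda-\phi}$ is invertible, then $T_{\lambda-\phi}$ is Fredholm with
index zero, and therefore $\lambda - \phi$ is zero-free by Theorem 7.31. But then $M_{\lambda-\phi} = \lambda - M_\phi$
is invertible. This argument shows that $\sigma(M_\phi) \subseteq \sigma(T_\phi)$. Now the corollary follows from
the inequalities

\[ \|\phi\|_\infty = \sup\{|\phi(\theta)| : \theta \in \mathbb{T}\} = \sup\{|\lambda| : \lambda \in \sigma(M_\phi)\}
   \le \sup\{|\lambda| : \lambda \in \sigma(T_\phi)\} \le \|T_\phi\| \le \|\phi\|_\infty. \]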


In the next theorem we denote by $T_z$ the Toeplitz operator with symbol $\phi(z) = z$. Its adjoint is the Toeplitz operator $T_z^*$ with symbol $\phi(z) = \bar z$. Identifying $H^2(\mathbb{D})$ with $\ell^2(\mathbb{N})$ by identifying the function $z \mapsto z^n$ with the $n$th unit vector $e_n$, the operators $T_z$ and $T_z^*$ are the right and left shift on $\ell^2(\mathbb{N})$, respectively. With these identifications in mind, the following theorem can be interpreted as giving a precise description of the closed subalgebra of $\mathscr{L}(\ell^2(\mathbb{N}))$ generated by the right and left shift.
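To make the two shifts concrete, here is a minimal sketch that models finitely supported sequences as Python lists. The helper names `right_shift`, `left_shift` and `sub` are hypothetical and only serve this illustration. It checks that the left shift undoes the right shift, while composing them in the other order loses exactly the zeroth coordinate (the coefficient of the constant function under the identification above).

```python
def right_shift(x):
    """Right shift: (x0, x1, ...) -> (0, x0, x1, ...)."""
    return [0] + list(x)

def left_shift(x):
    """Left shift: (x0, x1, x2, ...) -> (x1, x2, ...)."""
    return list(x[1:])

def sub(x, y):
    """Coordinatewise difference, padding the shorter list with zeros."""
    n = max(len(x), len(y))
    x = list(x) + [0] * (n - len(x))
    y = list(y) + [0] * (n - len(y))
    return [a - b for a, b in zip(x, y)]

x = [3, 1, 4, 1, 5]
# left shift after right shift is the identity
assert left_shift(right_shift(x)) == x
# right shift after left shift forgets the zeroth coordinate
defect = sub(x, right_shift(left_shift(x)))
assert defect == [3, 0, 0, 0, 0]
```

In particular the right shift is an isometry that is not surjective, which is why it is not unitary on sequences indexed by the natural numbers.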


Theorem 7.33 (Coburn). Let $\mathscr{T}$ denote the closed subalgebra in $\mathscr{L}(H^2(\mathbb{D}))$ generated by $T_z$ and $T_z^*$. Denoting by $\mathscr{K}$ the space of compact operators on $H^2(\mathbb{D})$, we have:

(1) $\mathscr{T} = \{T_\phi + K : \phi \in C(\mathbb{T}),\ K \in \mathscr{K}\}$;
(2) the mapping $\iota : T_\phi + K \mapsto \phi$ induces a multiplicative isometric isomorphism

$$\mathscr{T}/\mathscr{K} \simeq C(\mathbb{T}).$$

As a consequence, the representation of elements in $\mathscr{T}$ as the sum of a Toeplitz operator and a compact operator is unique, and we have the short exact sequence

$$0 \longrightarrow \mathscr{K} \longrightarrow \mathscr{T} \longrightarrow C(\mathbb{T}) \longrightarrow 0.$$

In the final statement we used standard terminology from Algebraic Topology: a sequence of mappings is exact if the range of every operator in the sequence equals the null space of the next operator in the sequence.

In the proof, as well as in later chapters, we need the following notation. For elements $g,h$ of a Hilbert space $H$, we denote by $g \,\bar\otimes\, h$ the rank one operator on $H$ defined by

$$(g \,\bar\otimes\, h)x := (x|h)g, \quad x \in H.$$

The bar in this notation serves to emphasise the fact that $\bar\otimes$ is not a tensor product, but rather its sesquilinear counterpart in the sense that for all $c \in \mathbb{K}$ and $g,h \in H$ we have

$$(cg) \,\bar\otimes\, h = c(g \,\bar\otimes\, h), \qquad g \,\bar\otimes\, (ch) = \bar c\,(g \,\bar\otimes\, h).$$

For norm one vectors $h \in H$, the operator $h \,\bar\otimes\, h$ is the orthogonal projection onto the one-dimensional subspace of $H$ spanned by $h$.
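The sesquilinearity of the rank one notation can be checked numerically. The following is a small sketch in $\mathbb{C}^2$, assuming the inner product is linear in the first and conjugate-linear in the second argument. The function names `inner`, `rank_one` and `close` are illustrative choices, not notation from the text.

```python
def inner(x, y):
    """Inner product (x|y), linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def rank_one(g, h):
    """Return the map x -> (x|h) g, the rank one operator built from g and h."""
    return lambda x: [inner(x, h) * gi for gi in g]

def close(u, v, tol=1e-12):
    """Entrywise comparison of two vectors up to a small tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))

g = [1 + 2j, 0.5j]
h = [2 - 1j, 3 + 0j]
x = [1j, -1 + 1j]
c = 2 - 3j

base = rank_one(g, h)(x)
# linear in the first slot: scaling g by c scales the operator by c
assert close(rank_one([c * gi for gi in g], h)(x), [c * v for v in base])
# conjugate-linear in the second slot: scaling h by c scales the operator by conj(c)
assert close(rank_one(g, [c * hi for hi in h])(x), [c.conjugate() * v for v in base])

# For a unit vector u the operator built from (u, u) is idempotent.
norm_h = abs(inner(h, h)) ** 0.5
u = [hi / norm_h for hi in h]
P = rank_one(u, u)
assert close(P(P(x)), P(x))
```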


Proof The crucial observation is that for all $\phi \in C(\mathbb{T})$ and $K \in \mathscr{K}$ we have

$$\|T_\phi + K\| \ge \|T_\phi\|. \tag{7.14}$$

In order to prove this it suffices to show that $\sigma(T_\phi) \subseteq \sigma(T_\phi + K)$, for this implies the claim via

$$\|T_\phi + K\| \ge r(T_\phi + K) \ge r(T_\phi) = \|T_\phi\|.$$

To prove the spectral inclusion we argue as follows. Suppose $\lambda \in \mathbb{C}$ is such that $\lambda - (T_\phi + K) = T_{\lambda-\phi} - K$ is invertible. Then this operator is Fredholm with index 0. By Dieudonné's theorem, this implies that $T_{\lambda-\phi}$ is Fredholm with index 0. It remains to prove that this implies the invertibility of $T_{\lambda-\phi}$.

Suppose now that $\psi \in C(\mathbb{T})$ is such that $T_\psi$ has index 0 but is not invertible. Then $T_\psi$ has a nontrivial null space. By Proposition 7.27, $T_\psi^* = T_{\bar\psi}$ has index 0 and fails to be invertible, hence also this operator has a nontrivial null space. This means that there are nonzero $g,h \in H^2(\mathbb{D})$ such that $P(\psi g) = P(\bar\psi h) = 0$, that is, the only nonzero Fourier coefficients of $\psi g$ and $\bar\psi h$ are those of negative index. Invoking some standard results from the theory of Hardy spaces (see the Notes), this can be shown to imply $\psi = 0$.

Applying the preceding argument to $\psi := \lambda - \phi$ it follows that if $T_{\lambda-\phi}$ were non-invertible, then $\phi = \lambda$. But then $T_{\lambda-\phi} - K = -K$ is compact and hence noninvertible, contradicting our assumption. This concludes the proof of (7.14).


(1): The inclusion ‘$\subseteq$’ is an immediate consequence of Lemma 7.30. To prove the inclusion ‘$\supseteq$’ we must prove that $\mathscr{T}$ contains all Toeplitz operators with continuous symbol and all compact operators. Given a function $\phi \in C(\mathbb{T})$, we use the Stone–Weierstrass theorem to find a sequence of trigonometric polynomials $p_n \to \phi$ in $C(\mathbb{T})$. Then $T_{p_n} \to T_\phi$ in operator norm by Corollary 7.32. Since $T_{p_n} = p_n(T_z)$ we have $T_{p_n} \in \mathscr{T}$, and since $\mathscr{T}$ is closed this implies $T_\phi \in \mathscr{T}$.

We prove next that $\mathscr{T}$ contains every compact operator. Let $S := T_z$ for brevity, where $z$ is shorthand for the function $z \mapsto z$. We have $S^*S = I$ and $I - SS^* = P_0$, the orthogonal projection onto the constant functions. These identities show that both $I$ and $P_0$ belong to $\mathscr{T}$. Clearly,

$$\mathscr{J} := \mathscr{T} \cap \mathscr{K}$$

is a closed ideal in $\mathscr{T}$ which is closed under taking adjoints, and since $P_0$ is compact we have $P_0 \in \mathscr{J}$. We will show that $\mathscr{J} = \mathscr{K}$.


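Fix arbitrary $f,g \in H^2(\mathbb{D})$ and $\varepsilon > 0$. There is a polynomial $p$ such that $\|p(S)1 - f\| < \varepsilon$. Then $P_0(p(S))^* \in \mathscr{J}$ and

$$\|P_0(p(S))^* - (1 \,\bar\otimes\, f)\| = \sup_{\|u\|=\|v\|=1} |(P_0(p(S))^*u - (1 \,\bar\otimes\, f)u \,|\, v)|
= \sup_{\|u\|=\|v\|=1} |((u|p(S)1) - (u|f))(1|v)|
= \|p(S)1 - f\| < \varepsilon.$$

In the same way, for any $g \in H^2(\mathbb{D})$ and $\varepsilon > 0$ there is a polynomial $q$ such that $\|q(S)1 - g\| < \varepsilon$. Then,

$$\|q(S)(1 \,\bar\otimes\, f) - (g \,\bar\otimes\, f)\| = \sup_{\|h\|=\|h'\|=1} |(q(S)(1 \,\bar\otimes\, f)h \,|\, h') - ((g \,\bar\otimes\, f)h \,|\, h')|
= \sup_{\|h\|=\|h'\|=1} |((q(S)1|h') - (g|h'))(h|f)|
= \|q(S)1 - g\|\,\|f\| < \varepsilon\|f\|.$$

Since $\varepsilon > 0$ was arbitrary and $\mathscr{J}$ is closed, it follows that every rank one operator $g \,\bar\otimes\, f$ is contained in $\mathscr{J}$. By linearity, the same is true for every finite rank operator. Since the finite rank operators are dense in $\mathscr{K}$ by Proposition 7.6, it follows that $\mathscr{K} \subseteq \mathscr{J}$.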


(2): From (7.14) it follows that

$$\|T_\phi + \mathscr{K}\| = \inf_{K \in \mathscr{K}} \|T_\phi + K\| \ge \|T_\phi\|.$$

Together with the trivial inequality $\|T_\phi\| \ge \inf_{K \in \mathscr{K}} \|T_\phi + K\|$ we conclude that

$$\|T_\phi + \mathscr{K}\| = \|T_\phi\| = \|\phi\|_\infty.$$

This shows that the mapping $T_\phi + K \mapsto \phi$ is well defined and isometric. Clearly it is surjective, and therefore it is an isometric isomorphism. Its multiplicativity follows from Lemma 7.30.

Using some elementary facts from the theory of C*-algebras, a more transparent alternative proof of Theorem 7.31 can be given as a corollary to Theorem 7.33. This proof is sketched in the Notes to this chapter.


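Remark 7.34. Identifying $H^2(\mathbb{D})$ with $\ell^2(\mathbb{N})$ as indicated above, the short exact sequence of the theorem induces a short exact sequence

$$0 \longrightarrow \mathscr{K}(\ell^2(\mathbb{N})) \longrightarrow \mathscr{T}(\ell^2(\mathbb{N})) \longrightarrow C(\mathbb{T}) \longrightarrow 0,$$

where $\mathscr{K}(\ell^2(\mathbb{N}))$ and $\mathscr{T}(\ell^2(\mathbb{N}))$ denote, respectively, the compact operators acting on $\ell^2(\mathbb{N})$ and the closed algebra generated by the left and right shift on $\ell^2(\mathbb{N})$, and the mapping onto $C(\mathbb{T})$ is the one induced by the mapping of the theorem under the identifications made.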


Problems 


7.1 Give an alternative proof of Proposition 7.5 by using the equivalence of compact- 
ness and sequential compactness. 

7.2 Let $X$ and $Y$ be Banach spaces. Prove that if $X$ is infinite-dimensional and $T \in \mathscr{L}(X,Y)$ is compact, then there exists a sequence $(x_n)_{n\ge 1}$ of norm one vectors in $X$ such that $\lim_{n\to\infty} Tx_n = 0$ in $Y$.

7.3 Let $(m_n)_{n\ge 1}$ be a bounded scalar sequence.

    (a) For $1 \le p < \infty$, show that the multiplication operator $(c_n)_{n\ge 1} \mapsto (m_n c_n)_{n\ge 1}$ is compact on $\ell^p$ if and only if $\lim_{n\to\infty} m_n = 0$.

    (b) Does the same result hold for $\ell^\infty$? And for $c_0$?
    Hint: Compare with Problem 2.32.

7.4 Let $1 \le p < q \le \infty$.
    (a) Prove that the inclusion mapping $\ell^p \subseteq \ell^q$ is not compact.
    (b) Prove that the inclusion mapping $L^q(0,1) \subseteq L^p(0,1)$ is not compact.

    Hint: Look up Khintchine’s inequality for the Rademacher functions $f_n(\theta) = \mathrm{sign}(\sin(2\pi 2^n \theta))$.

7.5 For $1 \le p \le \infty$ consider the bounded operator $T_p : L^p(0,1) \to C[0,1]$,

    $$T_p f(t) := \int_0^t f(s)\,ds, \quad t \in [0,1].$$

    (a) Show that if $1 < p \le \infty$, then $T_p$ is compact.
    (b) Is $T_1$ compact?

7.6 For any fixed $k \ge 1$, find a bounded operator $T$ acting on a Hilbert space such that $T^{k+1}$ is compact but $T^k$ is not.
7.7 Let $g \in C[0,1]$ be given. Show that the multiplication operator $T_g : f \mapsto fg$ on $C[0,1]$ is compact if and only if $g = 0$.

7.8 Let $X$ and $Y$ be Banach spaces. Show that an operator $T \in \mathscr{L}(X,Y)$ is compact if and only if there is a sequence $(x_n^*)_{n\ge 1}$ in $X^*$ such that $\lim_{n\to\infty} x_n^* = 0$ in $X^*$ and

    $$\|Tx\| \le \sup_{n\ge 1} |\langle x, x_n^*\rangle|, \quad x \in X.$$

    Hint: For the ‘if’ part consider $T$ as the composition of mappings

    $$x \mapsto (\langle x, x_n^*\rangle)_{n\ge 1} \mapsto Tx$$

    and use the result of the preceding problem; for the ‘only if’ part use the result of Problem 1.23 and the compactness of $T^*$.

7.9 The aim of this problem is to show how part (1) of Theorem 7.11 can be deduced from part (2). Let $X$ be a Banach space and let $T \in \mathscr{L}(X)$ be compact.

    (a) Using Proposition 6.17, show that every nonzero $\lambda \in \partial\sigma(T)$ is an eigenvalue.

    (b) Using part (a) and part (2) of Theorem 7.11, deduce that every nonzero $\lambda \in \sigma(T)$ is an eigenvalue.

7.10 Let $X$ be a Banach space. Show that for a bounded operator $T \in \mathscr{L}(X)$ the following assertions are equivalent:

    (1) $T$ is compact;

    (2) $\exp(T) - I$ is compact.

    Hint: To prove the implication (2)$\Rightarrow$(1) show that for large enough integers $k$ we have $T = (\exp(kT) - I) f_k(T)$, where $f_k(z) = z/(e^{kz} - 1)$ is holomorphic in a neighbourhood of $\sigma(T)$.

7.11 Let $T$ be a compact operator on a Banach space $X$, let $0 \ne \lambda \in \sigma(T)$, and let $\nu$ be its algebraic multiplicity. Let $X_\lambda := P_\lambda X$ be the range of the spectral projection associated with the point $\lambda$. Prove the following assertions:

    (a) a vector $x \in X$ belongs to $X_\lambda$ if and only if $(\lambda - T)^k x = 0$ for some $k \ge 1$;
    (b) for all $x \in X_\lambda$ we have $(\lambda - T)^\nu x = 0$;
    (c) if $(\lambda - T)^n x = 0$ for all $x \in X_\lambda$, then $n \ge \nu$.

7.12 Let $X$ be a Banach space. In this problem we write $[T]$ for the element $T + \mathscr{K}(X)$ of the Calkin algebra $\mathscr{L}(X)/\mathscr{K}(X)$. Show that the multiplication $[S] \circ [T] := [ST]$ is well defined on $\mathscr{L}(X)/\mathscr{K}(X)$ and satisfies

    $$\|[S] \circ [T]\|_{\mathscr{L}(X)/\mathscr{K}(X)} \le \|[S]\|_{\mathscr{L}(X)/\mathscr{K}(X)}\, \|[T]\|_{\mathscr{L}(X)/\mathscr{K}(X)}.$$

    In the terminology introduced in the Notes to this chapter, this shows that the Calkin algebra $\mathscr{L}(X)/\mathscr{K}(X)$ is a (unital) Banach algebra.
7.13 Let $X$ be a Banach space and $T \in \mathscr{L}(X)$ be a bounded operator. Show that if $T^k$ is compact for some integer $k \ge 1$, then $I + T$ is Fredholm. What is its index?

7.14 Prove that the two definitions of the winding number of a piecewise $C^1$ curve discussed in Section 7.3 agree.

7.15 Let $\phi \in L^\infty(\mathbb{T})$. Prove the following assertions:

    (a) the operator $M_\phi$ on $L^2(\mathbb{T})$ defined by

        $$M_\phi f = \phi f, \quad f \in L^2(\mathbb{T}),$$

        maps $H^2(\mathbb{D})$ into itself if and only if $\phi \in H^\infty(\mathbb{D})$, that is, upon identifying $\phi$ with a function in $L^2(\mathbb{T})$ whose negative Fourier coefficients vanish we have $\phi \in L^\infty(\mathbb{T})$;

    (b) if $\phi \in L^\infty(\mathbb{T})$ and $\psi \in H^\infty(\mathbb{T})$, then the associated Toeplitz operators satisfy

        $$T_\phi T_\psi f = T_{\phi\psi} f, \quad f \in H^2(\mathbb{D});$$

    (c) if $\phi, \psi \in H^\infty(\mathbb{T})$, then the associated Toeplitz operators satisfy

        $$T_\psi T_\phi f = T_{\phi\psi} f, \quad f \in H^2(\mathbb{D}).$$

7.16 Using notation introduced in Section 7.3.d, prove the following uniqueness result: If $T \in \mathscr{T}$ satisfies

    $$T = T_\phi + K = T_\psi + L$$

    with $\phi,\psi \in C(\mathbb{T})$ and $K,L \in \mathscr{K}(H^2(\mathbb{D}))$, then $\phi = \psi$ and $K = L$.

7.17 Show that $T \in \mathscr{T}$ is Fredholm if and only if $T = T_\phi + K$, where $\phi \in C(\mathbb{T})$ is zero-free and $K \in \mathscr{K}(H^2(\mathbb{D}))$.


8 
Bounded Operators on Hilbert Spaces 


The identification of a Hilbert space H with its dual via the Riesz representation theorem 
makes it possible to consider a bounded operator T and its adjoint simultaneously on H. 
This leads to the important classes of selfadjoint, unitary, and normal operators. Their 
spectral theory is particularly rich. Its full power comes to bear only in the next chapter, 
where we prove the spectral theorem for bounded normal operators. The present chapter 
discusses the elementary theory and, for normal operators T, establishes a generalisa- 
tion of the holomorphic calculus to a calculus for continuous functions on the spectrum 
o(T). Using this calculus, we prove a number of nontrivial results such as the exis- 
tence of a unique positive square root of a positive operator and a polar decomposition 
for general bounded operators. In the last section we establish the celebrated Sz.-Nagy 
theorem on the existence of unitary dilations for Hilbert space contractions. 


8.1 Selfadjoint, Unitary, and Normal Operators 


Throughout this chapter, $H$ is a complex Hilbert space. The following proposition is key to several proofs in this chapter. The example of rotation over $\frac12\pi$ in $\mathbb{R}^2$ shows that it fails for real Hilbert spaces.

Proposition 8.1. If $T \in \mathscr{L}(H)$ satisfies $(Tx|x) = 0$ for all $x \in H$, then $T = 0$.
Proof For all $x,y \in H$, from $(T(x+y)|x+y) = 0$ we obtain

$$(Tx|y) + (Ty|x) = 0. \tag{8.1}$$

Replacing $y$ by $iy$ we obtain $-i(Tx|y) + i(Ty|x) = 0$. Multiplying both sides with $i$ gives

$$(Tx|y) - (Ty|x) = 0. \tag{8.2}$$


Adding (8.1) and (8.2) gives (Tx|y) = 0 for all x,y € H. This implies the result. 
The trick used in the proof is called polarisation.
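The role of the complex scalars in Proposition 8.1 can be seen in a small computation. The sketch below (helper names `apply` and `inner` are illustrative) takes the rotation over a quarter turn in the plane: its quadratic form vanishes on every real vector, yet it is not the zero operator, and on $\mathbb{C}^2$ the quadratic form is no longer identically zero.

```python
def apply(T, x):
    """Apply a 2x2 matrix (given as nested lists) to a vector."""
    return [T[0][0] * x[0] + T[0][1] * x[1],
            T[1][0] * x[0] + T[1][1] * x[1]]

def inner(x, y):
    """Inner product (x|y), conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

R = [[0, -1],
     [1, 0]]          # rotation over pi/2 in the plane

# On real vectors the quadratic form x -> (Rx|x) vanishes identically ...
for x in ([1, 0], [0, 1], [2, -3], [0.5, 7]):
    assert inner(apply(R, x), x) == 0

# ... but on C^2 it does not: for x = (1, i) we get (Rx|x) = -2i.
x = [1, 1j]
assert inner(apply(R, x), x) == -2j
```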


Definition 8.2 (Normal, unitary, selfadjoint, and positive operators). An operator $T \in \mathscr{L}(H)$ is called:

• positive, if $(Tx|x) \ge 0$ for all $x \in H$;

• selfadjoint, if $T = T^*$;

• unitary, if $TT^* = T^*T = I$;

• normal, if $TT^* = T^*T$.

Here, $T^*$ is the Hilbert space adjoint of $T$ (cf. Section 4.3.b). These classes of operators can be viewed as operator analogues of the positive real numbers, the real numbers, the complex numbers of modulus one, and the complex numbers, respectively. Furthermore, orthogonal projections are analogous to the ‘Boolean’ set $\{0,1\}$. A number of results support this view:


— every positive operator is selfadjoint; 
— every selfadjoint operator is the difference of two positive operators; 
— every selfadjoint operator and every unitary operator is normal. 


The first assertion follows from Proposition 8.1. Indeed, if $T$ is positive, then $(Tx|x)$ is real and therefore $(T^*x|x) = (x|Tx) = \overline{(Tx|x)} = (Tx|x)$, that is, $((T - T^*)x|x) = 0$ for all $x \in H$. The second follows from the spectral theorem in the next chapter (see Problem 9.12), and the third is obvious from the definitions. Continuing our list,


— every invertible operator is the composition of an invertible positive operator and a 
unitary operator; 

— a bounded operator is unitary if and only if it is the complex exponential of a self- 
adjoint operator. 


The first result and the ‘if’ part of the second will be proved in the present chapter; the ‘only if’ part is a consequence of the spectral theorem (see Problem 9.13).
In terms of spectra we have the following characterisations:

— a normal operator is unitary if and only if its spectrum is contained in the unit circle;
— a normal operator is selfadjoint if and only if its spectrum is contained in the real line;
— a normal operator is positive if and only if its spectrum is contained in $[0,\infty)$;
— a normal operator is an orthogonal projection if and only if its spectrum is contained in $\{0,1\}$.


Furthermore, normal projections are orthogonal. 
Let us begin by proving an operator analogue of the decomposition of a complex 
number into real and imaginary parts. 
Proposition 8.3. For every operator $T \in \mathscr{L}(H)$ there exist unique selfadjoint operators $A,B \in \mathscr{L}(H)$ such that $T = A + iB$.

Proof The operators $A := \frac12(T + T^*)$ and $B := \frac{1}{2i}(T - T^*)$ are selfadjoint and $T = A + iB$. Suppose we also have $T = A' + iB'$ with $A'$ and $B'$ selfadjoint. Put $U := B - B'$. Then $(iU)^* = -iU^* = -iU$ and also $iU = i(B - B') = (T - A) - (T - A') = A' - A$, so $(iU)^* = (A' - A)^* = A' - A = iU$. It follows that $iU = -iU$ and therefore $B = B'$. This in turn implies $A = A'$.
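For matrices the decomposition of Proposition 8.3 is easy to compute. The following sketch works with square matrices stored as nested lists; the function names (`adjoint`, `combine`, `cartesian_parts`, `close`) are hypothetical helpers for this illustration only.

```python
def adjoint(T):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(T)
    return [[T[j][i].conjugate() for j in range(n)] for i in range(n)]

def combine(a, S, b, T):
    """Entrywise a*S + b*T."""
    n = len(S)
    return [[a * S[i][j] + b * T[i][j] for j in range(n)] for i in range(n)]

def close(S, T, tol=1e-12):
    """Entrywise comparison of two square matrices up to a small tolerance."""
    n = len(S)
    return all(abs(S[i][j] - T[i][j]) <= tol for i in range(n) for j in range(n))

def cartesian_parts(T):
    """Return (A, B) with A = (T + T*)/2 and B = (T - T*)/(2i)."""
    Ts = adjoint(T)
    A = combine(0.5, T, 0.5, Ts)
    B = combine(1 / 2j, T, -1 / 2j, Ts)
    return A, B

T = [[1 + 2j, 3 - 1j],
     [4j, -2 + 0.5j]]
A, B = cartesian_parts(T)

assert close(A, adjoint(A)) and close(B, adjoint(B))   # both parts are selfadjoint
assert close(combine(1, A, 1j, B), T)                  # T = A + iB
```

For a 1x1 matrix this reduces to the decomposition of a complex number into its real and imaginary parts.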


A complex number satisfies $|z| = 1$ if and only if there is a real number $x$ such that $z = e^{ix}$. The operator analogue of the ‘if’ part is contained in the next proposition.

Proposition 8.4. If $T \in \mathscr{L}(H)$ is selfadjoint, then $e^{iT}$ is unitary.

Proof From the expansion $e^{iT} = \sum_{n=0}^\infty \frac{i^n}{n!} T^n$ we see that $(e^{iT})^* = \sum_{n=0}^\infty \frac{(-i)^n}{n!} T^n = e^{-iT}$. It is elementary to check that $e^{iT}e^{-iT} = e^{-iT}e^{iT} = I$ by writing out the defining power series and multiplying them. Alternatively, this identity follows from the multiplicativity of the entire calculus of $T$ applied with $f(z) = \exp(iz)$ and $g(z) = \exp(-iz)$.
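As a numerical sanity check of Proposition 8.4, the sketch below sums the first terms of the exponential series for a Hermitian 2x2 matrix and verifies that the result is unitary up to rounding. The truncation length `terms=40` and the helper names are assumptions of this illustration, not part of the text.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(A):
    """Conjugate transpose."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def exp_i(T, terms=40):
    """Partial sum of the series sum_n (iT)^n / n! for e^{iT}."""
    n = len(T)
    identity = [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms):
        term = matmul(term, T)
        term = [[1j * v / k for v in row] for row in term]   # now term = (iT)^k / k!
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return total

T = [[1.0 + 0j, 2 - 1j],
     [2 + 1j, -0.5 + 0j]]          # a selfadjoint (Hermitian) 2x2 matrix
U = exp_i(T)
UU = matmul(adjoint(U), U)
for i in range(2):
    for j in range(2):
        assert abs(UU[i][j] - (1 if i == j else 0)) < 1e-10
```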


We have the following simple characterisation of unitary operators:
Proposition 8.5. For an operator $U \in \mathscr{L}(H)$ the following assertions are equivalent:

(1) $U$ is unitary;
(2) $U$ is surjective and $\|Ux\| = \|x\|$ for all $x \in H$;
(3) $U$ is surjective and $(Ux|Uy) = (x|y)$ for all $x,y \in H$.

Proof (1)$\Rightarrow$(3): If $U$ is unitary, then $U$ is invertible (with inverse $U^{-1} = U^*$) and therefore $U$ is surjective. Moreover, $(Ux|Uy) = (x|U^*Uy) = (x|y)$.

(3)$\Rightarrow$(2): Take $x = y$.

(2)$\Rightarrow$(1): We have $(U^*Ux|x) = (Ux|Ux) = \|Ux\|^2 = \|x\|^2 = (x|x)$ for all $x \in H$ and therefore $U^*U = I$ by Proposition 8.1 applied to $U^*U - I$. It follows that $U^*$ is a left inverse to $U$. The assumptions further imply that $U$ is surjective and injective, hence invertible. The inverse must be equal to the left inverse, which is therefore $U^*$. It follows that also $UU^* = I$.

The right shift on $\ell^2$ shows that the surjectivity assumption cannot be omitted from (2) and (3).

Example 8.6. The left and right shifts on $\ell^2(\mathbb{Z})$ are unitary. Indeed, the adjoint of the left (right) shift is the right (left) shift, so in either case the adjoint equals the inverse. Similarly, left and right translations on $L^2(\mathbb{R})$ are unitary.

Example 8.7. The Fourier–Plancherel transform on $L^2(\mathbb{R}^d)$ is unitary; this follows from Theorem 5.26 and Proposition 8.5.
Projections in Hilbert spaces are orthogonal if and only if they are selfadjoint:
Proposition 8.8. For a projection $P \in \mathscr{L}(H)$ the following assertions are equivalent:

(1) $P$ is orthogonal, that is, its null space and range are orthogonal;
(2) $P$ is selfadjoint.

Proof (1)$\Rightarrow$(2): If $P$ is orthogonal, then $x - Px \perp Py$ for all $x,y \in H$, noting that $x - Px \in N(P)$ (since $P(x - Px) = Px - P^2x = Px - Px = 0$) and $Py \in R(P)$. Therefore,

$$(x|Py) = (Px|Py) + (x - Px|Py) = (Px|Py)$$

and similarly

$$(Px|y) = (Px|Py) + (Px|y - Py) = (Px|Py),$$

so $(Px|y) = (x|Py)$ and $P$ is selfadjoint.
(2)$\Rightarrow$(1): If $P$ is a selfadjoint projection, then

$$(x - Px|Py) = (P^*(x - Px)|y) = (P(x - Px)|y) = 0$$

since $P = P^*$. Since every element in $N(P)$ is of the form $x - Px$, this shows that $N(P) \perp R(P)$, that is, the projection $P$ is orthogonal.
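A two-dimensional example makes the equivalence tangible. In the sketch below (real matrices, so the adjoint is the transpose; helper names are illustrative) the matrix `P` is an oblique projection: it is idempotent but not selfadjoint, and its range and null space are not orthogonal. The matrix `Q` is the orthogonal projection with the same range.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    """Transpose, which is the adjoint for real matrices."""
    n = len(A)
    return [[A[j][i] for j in range(n)] for i in range(n)]

def dot(x, y):
    """Real inner product."""
    return sum(a * b for a, b in zip(x, y))

P = [[1, 1],
     [0, 0]]                      # P(x1, x2) = (x1 + x2, 0)
assert matmul(P, P) == P          # P is a projection
assert transpose(P) != P          # but it is not selfadjoint
# R(P) is spanned by (1, 0) and N(P) by (1, -1); they are not orthogonal.
assert dot([1, 0], [1, -1]) != 0

Q = [[1, 0],
     [0, 0]]                      # orthogonal projection onto the first axis
assert matmul(Q, Q) == Q and transpose(Q) == Q
# R(Q) is spanned by (1, 0) and N(Q) by (0, 1); they are orthogonal.
assert dot([1, 0], [0, 1]) == 0
```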


We now turn to the study of some spectral properties of Hilbert space operators. From Proposition 6.18 we recall that for every bounded operator $T$ on a Banach space the Banach space adjoint satisfies

$$\sigma(T^*) = \sigma(T).$$

A similar result holds for the spectrum of the Hilbert space adjoint $T^*$:
Proposition 8.9. For all $T \in \mathscr{L}(H)$ we have

$$\sigma(T^*) = \overline{\sigma(T)},$$

where the bar denotes complex conjugation.

Proof The proof follows the lines of Proposition 6.18 but is simpler because of the Riesz representation theorem. The idea is to prove that $\lambda \in \rho(T)$ if and only if $\bar\lambda \in \rho(T^*)$, and that in this case $(R(\lambda,T))^* = R(\bar\lambda,T^*)$.

First suppose that $\lambda \in \rho(T)$. Then

$$(\bar\lambda - T^*)[R(\lambda,T)]^* = (\lambda - T)^*[R(\lambda,T)]^* = [R(\lambda,T)(\lambda - T)]^* = I^* = I.$$

In the same way it is shown that $[R(\lambda,T)]^*(\bar\lambda - T^*) = I$. It follows that $\bar\lambda \in \rho(T^*)$ and $R(\bar\lambda,T^*) = [R(\lambda,T)]^*$.

If $\bar\lambda \in \rho(T^*)$, applying what we just proved to $T^*$ gives $\lambda \in \rho(T^{**}) = \rho(T)$ and

$$R(\lambda,T) = R(\lambda,T^{**}) = [R(\bar\lambda,T^*)]^*.$$
For unitary operators we have the following simple result.

Proposition 8.10. If $U \in \mathscr{L}(H)$ is unitary, then $\sigma(U)$ is contained in the unit circle.

Proof Since $U$ is an invertible isometry, this follows from Corollary 6.14.

Alternatively we can apply the spectral mapping theorem of the holomorphic calculus, which gives $\sigma(U^*) = \sigma(U^{-1}) = (\sigma(U))^{-1}$. Since both $\sigma(U)$ and $\sigma(U^*)$ are contained in the closed unit disc, $\sigma(U)$ must therefore be contained in the unit circle.

In the converse direction, a normal operator whose spectrum is contained in the unit circle is unitary. The proof of this fact is harder and will be given in Corollary 9.18.

The next result describes the spectrum of selfadjoint operators. 


Theorem 8.11 (Spectrum of selfadjoint operators). An operator $T \in \mathscr{L}(H)$ is selfadjoint if and only if $(Tx|x) \in \mathbb{R}$ for all $x \in H$. If $T$ is selfadjoint on $H$, then

$$\|T\| = \sup_{\|x\|\le 1} |(Tx|x)| = \max\{|m|,|M|\}$$

and

$$\sigma(T) \subseteq [m,M], \quad m,M \in \sigma(T),$$

where $m := \inf_{\|x\|=1}(Tx|x)$ and $M := \sup_{\|x\|=1}(Tx|x)$.

Proof If $(Tx|x) \in \mathbb{R}$, then $(T^*x|x) = (x|Tx) = \overline{(Tx|x)} = (Tx|x)$. Hence if $(Tx|x) \in \mathbb{R}$ for all $x \in H$, then $T = T^*$ by Proposition 8.1 applied to $T - T^*$. Conversely if $T = T^*$, then $(Tx|x) = (x|Tx) = \overline{(Tx|x)}$ and therefore $(Tx|x) \in \mathbb{R}$.

Next we prove that $\sigma(T) \subseteq \mathbb{R}$. To this end let $\lambda = \alpha + i\beta$ with $\alpha,\beta \in \mathbb{R}$ and $\beta \ne 0$; we wish to prove that $\lambda \in \rho(T)$. For all $x \in H$ we have

$$\|(\lambda - T)x\|\,\|x\| \ge |((\lambda - T)x|x)| = |\alpha(x|x) - (Tx|x) + i\beta(x|x)| \ge |\beta|\,\|x\|^2,$$

using that $(Tx|x) \in \mathbb{R}$ in the last step. By Proposition 1.21 this implies that $\lambda - T$ is injective and has closed range. Replacing $\lambda$ by $\bar\lambda$, we also conclude that $\bar\lambda - T$ is injective and has closed range. By Proposition 4.31, this implies that $\lambda - T = (\bar\lambda - T)^*$ has dense range. We conclude that $\lambda - T$ is both injective and surjective, hence invertible, and therefore $\lambda \in \rho(T)$.

By now we have shown that $\sigma(T) \subseteq \mathbb{R}$. Next we show that $\sigma(T) \subseteq [m,M]$. Let $\lambda = M + \delta$ with $\delta > 0$. Then, by the definition of $M$, for all $x \in H$ we have

$$\|(\lambda - T)x\|\,\|x\| \ge ((\lambda - T)x|x) = M(x|x) - (Tx|x) + \delta(x|x) \ge \delta(x|x) = \delta\|x\|^2.$$

The same argument as before shows that $\lambda - T$ is both injective and surjective, hence invertible, and therefore $\lambda \in \rho(T)$. This proves that $(M,\infty) \subseteq \rho(T)$. Applying this result to $-T$ (and replacing $[m,M]$ with $[-M,-m]$) we also obtain $(-\infty,m) \subseteq \rho(T)$. This completes the proof that $\sigma(T) \subseteq [m,M]$.




We prove next that $\|T\| = \max\{|m|,|M|\}$; this implies $\|T\| = \sup_{\|x\|\le 1} |(Tx|x)|$. Replacing $T$ by $-T$ if necessary, we may assume that $|m| \le |M|$. Clearly we then have $|M| = \sup_{\|x\|=1} |(Tx|x)| \le \|T\|$. To prove the converse inequality $\|T\| \le |M|$, note that for all $x \in H$ with $\|x\| = 1$ and all $\mu > 0$ we have

$$\begin{aligned}
4\|Tx\|^2 &= (T(\mu x + \mu^{-1}Tx)\,|\,\mu x + \mu^{-1}Tx) - (T(\mu x - \mu^{-1}Tx)\,|\,\mu x - \mu^{-1}Tx) \\
&\le |M|\,\|\mu x + \mu^{-1}Tx\|^2 + |m|\,\|\mu x - \mu^{-1}Tx\|^2 \\
&\le |M|\,\|\mu x + \mu^{-1}Tx\|^2 + |M|\,\|\mu x - \mu^{-1}Tx\|^2 \\
&= 2|M|(\mu^2\|x\|^2 + \mu^{-2}\|Tx\|^2) = 2|M|(\mu^2 + \mu^{-2}\|Tx\|^2),
\end{aligned}$$

where the first inequality follows from the definitions of $m$ and $M$ and the next equality uses the parallelogram identity. Taking $\mu^2 = \|Tx\|$ (the case $Tx = 0$ being trivial) we obtain, for all $x \in H$ with $\|x\| = 1$,

$$4\|Tx\|^2 \le 2|M|(\|Tx\| + \|Tx\|) = 4|M|\,\|Tx\|.$$

It follows that $\|Tx\| \le |M|$ for all $x \in H$ with $\|x\| = 1$, so $\|T\| \le |M|$.

The last thing to prove is that $m,M \in \sigma(T)$. We prove this for $M$; the result for $m$ follows by considering $-T$. Replacing $T$ by $T - m$ we may assume that $0 = m \le M$. Then $(Tx|x) \ge 0$ for all $x \in H$ and therefore

$$M = \sup_{\|x\|=1} (Tx|x) = \sup_{\|x\|=1} |(Tx|x)| = \|T\|.$$

Choose a sequence $(x_n)_{n\ge 1}$ of norm one vectors such that $\lim_{n\to\infty}(Tx_n|x_n) = M$. Then,

$$\begin{aligned}
\|(M - T)x_n\|^2 &= ((M - T)x_n\,|\,(M - T)x_n) \\
&= M^2\|x_n\|^2 - 2M(Tx_n|x_n) + \|Tx_n\|^2 \\
&\le M^2 - 2M(Tx_n|x_n) + \|T\|^2 = M^2 - 2M(Tx_n|x_n) + M^2,
\end{aligned}$$

which tends to $M^2 - 2M^2 + M^2 = 0$ as $n \to \infty$. This implies that $M - T$ cannot be invertible: for if it was, with inverse $S$, then

$$\|x_n\| = \|S(M - T)x_n\| \le \|S\|\,\|(M - T)x_n\| \to 0,$$

which contradicts the fact that $\|x_n\| = 1$. This proves that $M \in \sigma(T)$.
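For a real symmetric 2x2 matrix the quantities in Theorem 8.11 can be computed in closed form, which gives a quick numerical check. The sketch below uses the facts that $m$ and $M$ are then the smallest and largest eigenvalue, and that the operator norm is the square root of the largest eigenvalue of $T^{\mathsf T}T$. The function names are illustrative.

```python
import math

def sym_eigs(a, b, c):
    """Eigenvalues (smallest, largest) of the real symmetric matrix [[a, b], [b, c]]."""
    mid = (a + c) / 2
    rad = math.hypot((a - c) / 2, b)
    return mid - rad, mid + rad

def op_norm(a, b, c, d):
    """Operator norm of [[a, b], [c, d]]: square root of the top eigenvalue of T^T T."""
    # T^T T = [[a^2 + c^2, ab + cd], [ab + cd, b^2 + d^2]]
    _, top = sym_eigs(a * a + c * c, a * b + c * d, b * b + d * d)
    return math.sqrt(top)

a, b, c = 1.0, 2.0, -3.0              # T = [[1, 2], [2, -3]] is selfadjoint
m, M = sym_eigs(a, b, c)              # extreme values of (Tx|x) on the unit circle
assert abs(op_norm(a, b, b, c) - max(abs(m), abs(M))) < 1e-12

# The quadratic form (Tx|x) on unit vectors stays inside [m, M].
for k in range(360):
    t = math.radians(k)
    x, y = math.cos(t), math.sin(t)
    q = a * x * x + 2 * b * x * y + c * y * y
    assert m - 1e-12 <= q <= M + 1e-12
```

In this example $|m| > |M|$, so the norm is attained at the lower end of the spectrum.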


In the converse direction, a normal operator whose spectrum is contained in the real 
line is selfadjoint. This will be proved in Corollary 9.18. 

It is an immediate consequence of Theorem 8.11 that the norm of a selfadjoint op- 
erator equals its spectral radius. More generally this is true for normal operators; see 
Proposition 8.13. This equality of norm and spectral radius can sometimes be used to 
determine the norm of an operator. We illustrate this by determining the norm of the 
Volterra operator. 
Example 8.12 (Volterra operator). From Example 1.31 we recall that the Volterra operator is the operator $T \in \mathscr{L}(L^2(0,1))$ given by the indefinite integral

$$(Tf)(s) = \int_0^s f(t)\,dt, \quad f \in L^2(0,1),\ s \in (0,1).$$

The operator $T$ fails to be selfadjoint (it even fails to be normal, see Problem 8.12), but we have $\|T\| = \|S\|$, where $S \in \mathscr{L}(L^2(0,1))$ is defined by

$$(Sf)(s) := \int_0^{1-s} f(t)\,dt, \quad f \in L^2(0,1),\ s \in (0,1),$$

as is immediate from the identity $(Sf)(s) = (Tf)(1-s)$. This identity also implies

$$\begin{aligned}
(f|S^*g) = (Sf|g) &= \int_0^1 (Tf)(1-s)\overline{g(s)}\,ds = \int_0^1 (Tf)(s)\overline{g(1-s)}\,ds \\
&= \int_0^1\!\!\int_0^s f(t)\overline{g(1-s)}\,dt\,ds = \int_0^1\!\!\int_t^1 f(t)\overline{g(1-s)}\,ds\,dt \\
&= \int_0^1\!\!\int_0^{1-t} f(t)\overline{g(s)}\,ds\,dt = \int_0^1 f(t)\overline{(Sg)(t)}\,dt = (f|Sg),
\end{aligned}$$

which shows that $S$ is selfadjoint. By Example 7.4 $S$ is compact, and therefore by Theorem 7.11 every nonzero $\lambda \in \sigma(S)$ is an eigenvalue. To compute the spectral radius $r(S)$ we therefore have to determine the set of nonzero eigenvalues of $S$.

Suppose that $\lambda \ne 0$ is an eigenvalue of $S$ and let $f$ be an eigenfunction. Then

$$\lambda f(s) = \int_0^{1-s} f(t)\,dt, \tag{8.3}$$

and the right-hand side is a continuous function of $s$. It follows that $f \in C[0,1]$. Then the same argument shows that in fact $f \in C^1[0,1]$, and applying the same argument once more gives $f \in C^2[0,1]$. Differentiating (8.3) twice gives

$$\begin{aligned}
f'(0) &= f(1) = 0, \\
\lambda f'(s) &= -f(1-s), \quad s \in [0,1], \\
\lambda^2 f''(s) &= -f(s), \quad s \in [0,1].
\end{aligned}$$

The reader may check that these equations admit a nonzero solution if and only if $\frac{1}{\lambda} = \frac{\pi}{2} + 2\pi n$ for some $n \in \mathbb{Z}$ and that the solutions $f_n(s) = \cos((\frac{\pi}{2} + 2\pi n)s)$ are indeed eigenfunctions of $S$. The eigenvalue of $S$ of largest absolute value therefore equals $\frac{2}{\pi}$. We conclude that

$$\|T\| = \|S\| = r(S) = \frac{2}{\pi}.$$
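The eigenfunctions in this example can be verified numerically. The sketch below assumes the eigenvalue condition reads $1/\lambda = \pi/2 + 2\pi n$ (as the eigenfunctions $\cos((\pi/2 + 2\pi n)s)$ suggest) and approximates the integral defining $S$ by the midpoint rule; the step count and tolerance are ad hoc choices.

```python
import math

def S(f, s, steps=20000):
    """Midpoint-rule approximation of (Sf)(s), the integral of f over [0, 1 - s]."""
    upper = 1.0 - s
    h = upper / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

def eigenpair(n):
    """Candidate eigenvalue 1/omega and eigenfunction cos(omega s), omega = pi/2 + 2 pi n."""
    omega = math.pi / 2 + 2 * math.pi * n
    return 1 / omega, (lambda s: math.cos(omega * s))

# Check S f = lambda f at a few sample points for n = -1, 0, 1.
for n in (-1, 0, 1):
    lam, f = eigenpair(n)
    for s in (0.0, 0.3, 0.8, 1.0):
        assert abs(S(f, s) - lam * f(s)) < 1e-6

# The eigenvalue of largest absolute value is the one with n = 0, namely 2/pi.
lam0, _ = eigenpair(0)
assert abs(lam0 - 2 / math.pi) < 1e-15
assert all(abs(eigenpair(n)[0]) < lam0 for n in (-3, -2, -1, 1, 2, 3))
```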
The remainder of this section is devoted to studying some spectral properties of normal operators. Our first aim is to prove that the spectral radius of a normal operator equals the operator norm. For selfadjoint operators this has already been observed as a consequence of Theorem 8.11, and for unitary operators this is an immediate consequence of Proposition 8.10.


Proposition 8.13. An operator $T \in \mathscr{L}(H)$ is normal if and only if

$$\|Tx\| = \|T^*x\|, \quad x \in H.$$

If $T$ is normal, then $\|T^n\| = \|T\|^n$ for all $n \in \mathbb{N}$, and therefore $r(T) = \|T\|$.
Proof If $T$ is normal, then

$$\|Tx\|^2 = (Tx|Tx) = (x|T^*Tx) = (x|TT^*x) = (T^*x|T^*x) = \|T^*x\|^2.$$

In the converse direction, the equality implies $((T^*T - TT^*)x|x) = 0$ for all $x \in H$ and therefore $T^*T - TT^* = 0$ by Proposition 8.1. This proves the first assertion.
If $T$ is normal, then for all norm one vectors $x \in H$ we have

$$\|T^*Tx\|^2 = ((T^*T)^2x|x) = ((T^*)^2T^2x|x) = \|T^2x\|^2$$

and therefore, since $\|T^*T\| = \|T\|^2$ by Proposition 4.28,

$$\|T^2\| = \|T^*T\| = \|T\|^2. \tag{8.4}$$

Suppose the identity $\|T^n\| = \|T\|^n$ has been proved for $n = 2,\dots,k$. Then, for all norm one vectors $x \in H$,

$$\begin{aligned}
\|T^kx\|^2 &= (T^kx|T^kx) = (T^*T^kx\,|\,T^{k-1}x) \\
&\le \|T^*T^kx\|\,\|T^{k-1}x\| = \|T^{k+1}x\|\,\|T^{k-1}x\| \le \|T^{k+1}\|\,\|T^{k-1}\| = \|T^{k+1}\|\,\|T\|^{k-1},
\end{aligned}$$

using the first assertion in the form $\|T^*y\| = \|Ty\|$ with $y = T^kx$ and the inductive assumption (whose base case $n = 2$ is (8.4)). Taking the supremum over all norm one vectors $x$ gives $\|T\|^{2k} = \|T^k\|^2 \le \|T^{k+1}\|\,\|T\|^{k-1}$, which results in the inequality $\|T\|^{k+1} \le \|T^{k+1}\|$. Since the reverse inequality holds trivially, we conclude that $\|T^{k+1}\| = \|T\|^{k+1}$.
The final assertion follows from the spectral radius formula (Theorem 6.23).
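The norm identity characterising normality is easy to test on small matrices. The sketch below (helper names are illustrative) compares a normal 2x2 matrix with a nilpotent one; for the latter the identity fails, and its spectral radius 0 is strictly smaller than its norm 1.

```python
import math

def apply(T, x):
    """Apply a square matrix given as nested lists to a vector."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def adjoint(T):
    """Conjugate transpose."""
    n = len(T)
    return [[T[j][i].conjugate() for j in range(n)] for i in range(n)]

def norm(x):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))

normal = [[2, -1], [1, 2]]        # commutes with its adjoint
nilpotent = [[0, 1], [0, 0]]      # does not commute with its adjoint

# For the normal matrix, Tx and T*x always have the same norm.
samples = [[1, 0], [0, 1], [1, 1j], [2 - 1j, 0.5]]
for x in samples:
    assert abs(norm(apply(normal, x)) - norm(apply(adjoint(normal), x))) < 1e-12

# For the nilpotent matrix the identity fails already at x = (1, 0).
x = [1, 0]
assert norm(apply(nilpotent, x)) == 0.0
assert norm(apply(adjoint(nilpotent), x)) == 1.0
```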


Recall that $\lambda \in \mathbb{C}$ is called an approximate eigenvalue of $T$ if there exists a sequence $(x_n)_{n\ge 1}$ in $X$ such that $\|x_n\| = 1$ for all $n \ge 1$ and $\lim_{n\to\infty} \|Tx_n - \lambda x_n\| = 0$. By Proposition 6.17 the boundary spectrum of any bounded operator on a Banach space consists of approximate eigenvalues. For normal operators on Hilbert spaces more is true:

Proposition 8.14. Every point in the spectrum of a normal operator $T \in \mathscr{L}(H)$ is an approximate eigenvalue.

Proof Suppose $\lambda \in \mathbb{C}$ is not an approximate eigenvalue. Then $\lambda - T$ is injective (otherwise $\lambda$ would be an eigenvalue), and $\|\lambda x - Tx\| = \|\bar\lambda x - T^*x\|$ implies that also $(\lambda - T)^*$ is injective, that is, $\lambda - T$ has dense range. Let us prove that $\lambda - T$ has closed range. Let $(x_n)_{n\ge 1}$ be a sequence in $H$ such that $\lim_{n\to\infty}(\lambda - T)x_n = y$ in $H$. Then

$$\lim_{N\to\infty} \sup_{n,m\ge N} \|(\lambda - T)(x_n - x_m)\| = 0.$$

Unless we have $\lim_{m,n\to\infty} \|x_n - x_m\| = 0$, normalisation allows us to construct an approximate eigensequence to arrive at a contradiction. Thus $\lim_{m,n\to\infty} \|x_n - x_m\| = 0$, which means that $(x_n)_{n\ge 1}$ is Cauchy and therefore converges to a limit $x$. Then $y = (\lambda - T)x$. We have shown that $\lambda - T$ is surjective. Since this operator is also injective, it follows that $\lambda \in \rho(T)$.


Theorem 7.11 implies that every nonzero element of the spectrum of a compact oper- 
ator is both an isolated point and an eigenvalue. The final result of this section states that 
for normal operators on a Hilbert space, all isolated points in the spectrum are eigenval- 
ues; no compactness assumption is needed. Normality cannot be omitted: the Volterra 
operator has spectrum {0}, but 0 is not an eigenvalue (see Problem 8.12). 

If $T$ is a bounded operator on $H$, for $\lambda \in \mathbb{C}$ we set $E_\lambda := \{x \in H : Tx = \lambda x\}$. Thus $\lambda$ is an eigenvalue for $T$ if and only if $E_\lambda$ is nonzero. We recall that spectral projections have been defined in Theorem 6.25.

Theorem 8.15 (Isolated points are eigenvalues). Let $T \in \mathscr{L}(H)$ be a normal operator and let $\lambda$ be an isolated point in $\sigma(T)$. Then $\lambda$ is an eigenvalue for $T$ and the spectral projection $P^{\{\lambda\}}$ corresponding to $\{\lambda\}$ equals the orthogonal projection $P_\lambda$ onto $E_\lambda$.


The proof uses the following simple observation. 


Lemma 8.16. Let $T \in \mathscr{L}(H)$. If $Y$ is a closed subspace of $H$, then:

(1) if $Y$ is invariant under $T$, then $Y^\perp$ is invariant under $T^*$;
(2) if $Y$ is invariant under $T$ and $T^*$, then $Y^\perp$ is invariant under $T$ and $T^*$ and

$$(T|_Y)^* = T^*|_Y \quad\text{and}\quad (T|_{Y^\perp})^* = T^*|_{Y^\perp}$$

as operators in $\mathscr{L}(Y)$ and $\mathscr{L}(Y^\perp)$, respectively.

In particular, if $T$ is selfadjoint (respectively, normal) and $Y$ is invariant under $T$ and $T^*$, then $T|_Y$ and $T|_{Y^\perp}$ are selfadjoint (respectively, normal).

Proof If $Y$ is invariant under $T$, then for all $y \in Y$ and $y^\perp \in Y^\perp$ we have $(y|T^*y^\perp) = (Ty|y^\perp) = 0$. This proves (1). The first assertion of (2) follows as well, and if $Y$ is invariant under $T$ and $T^*$, then for all $y,y' \in Y$ we have

$$(y|(T|_Y)^*y') = (T|_Y y|y') = (Ty|y') = (y|T^*y') = (y|T^*|_Y y').$$

This proves the first identity of (2). The second is proved in the same way. The final assertion is an immediate consequence of (2).


Proof of Theorem 8.15 Replacing $T$ by $T - \lambda$ we may assume that $\lambda = 0$. Let $P^{\{0\}}$ denote the spectral projection corresponding to $\{0\}$ and denote its range by $E^{\{0\}}$. We wish to prove that 0 is an eigenvalue for $T$, that $E_0 = E^{\{0\}}$, and that $P^{\{0\}}$ is an orthogonal projection. Once these facts have been proved, it follows that $P^{\{0\}}$ and $P_0$ are orthogonal projections onto the same closed subspace of $H$ and therefore are equal.

As we have seen in Theorem 6.25, $T$ maps $E^{\{0\}}$ into itself, and if we denote by $T^{\{0\}} := T|_{E^{\{0\}}}$ the restriction of $T$ to $E^{\{0\}}$, then $\sigma(T^{\{0\}}) = \{0\}$. In particular this implies that $E^{\{0\}} \ne \{0\}$ (trivially, every operator on $\{0\}$ has empty spectrum).

Since $T$ is normal, the formula for the spectral projection of Theorem 6.25 implies

$$T^*P^{\{0\}}x = \frac{1}{2\pi i}\int_\Gamma T^*R(\lambda,T)x\,d\lambda = \frac{1}{2\pi i}\int_\Gamma R(\lambda,T)T^*x\,d\lambda = P^{\{0\}}T^*x,$$

where $\Gamma$ is a circular contour of small enough radius surrounding 0. This shows that $T^*$ leaves $E^{\{0\}}$ invariant. Hence by Lemma 8.16, the restricted operator $T^{\{0\}}$ is normal as an operator on $E^{\{0\}}$. By Proposition 8.13, $\|T^{\{0\}}\| = r(T^{\{0\}}) = 0$ and therefore $T^{\{0\}} = 0$. This means that $Tx_0 = T^{\{0\}}x_0 = 0$ for all $x_0 \in E^{\{0\}}$, so $E^{\{0\}} \subseteq E_0$. Moreover, since $E^{\{0\}}$ is nontrivial, 0 is an eigenvalue of $T$.

For all $y \in E_0$ we have $Ty = 0$ and therefore

$$P^{\{0\}}y = \frac{1}{2\pi i}\int_\Gamma R(\lambda,T)y\,d\lambda = \frac{1}{2\pi i}\int_\Gamma R(\lambda,T)\lambda^{-1}(\lambda - T)y\,d\lambda = \frac{1}{2\pi i}\int_\Gamma \lambda^{-1}y\,d\lambda = y$$

with $\Gamma$ as before. It follows that $y \in R(P^{\{0\}}) = E^{\{0\}}$. This proves the inclusion $E_0 \subseteq E^{\{0\}}$.

It remains to be shown that $P^{\{0\}}$ equals the orthogonal projection onto $E_0$. Since $P^{\{0\}}$ is normal (this follows from the integral formula for $P^{\{0\}}$), this is a consequence of Corollary 9.18. The reader may check that no circularity is introduced; the proof of this corollary does not depend on the present result.


Corollary 8.17. The geometric and algebraic multiplicity of every nonzero element in 
the spectrum of a compact normal operator coincide. 


This justifies the terminology multiplicity to denote the geometric and algebraic mul- 
tiplicity of such a point. 
We have the following commutation theorem for normal operators. 


Theorem 8.18 (Fuglede—Putnam-Rosenblum). [fT € 2(H) is normal and S € 2(H) 
is bounded and satisfies 


ST =TS, 
then 
ST* =T’*S. 
Proof We prove the more general result that if T;, 7 are normal and S is bounded such 
that ST, = T,S, then ST/ = T;S. 
Step 1 — Let V € L(A) be an arbitrary bounded operator. Expanding the exponential 
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as a power series and taking adjoints termwise, we obtain 
lexp(V*—V)]* = exp(V —V*) = [exp(V*—V)]-! 
and therefore exp(V — V*) is unitary. 
Step 2 – By induction, the assumption $ST_1 = T_2S$ implies $ST_1^n = T_2^nS$ for all $n \in \mathbb{N}$
and therefore $S\exp(T_1) = \exp(T_2)S$. Since the normality of an operator $T$ implies
$\exp(T^* - T) = \exp(T^*)\exp(-T)$, this identity and the result of Step 1 imply

\[
\begin{aligned}
\exp(T_2^*)S\exp(-T_1^*) &= \exp(T_2^* - T_2)\exp(T_2)S\exp(-T_1^*) \\
&= \exp(T_2^* - T_2)S\exp(T_1)\exp(-T_1^*) \\
&= \exp(T_2^* - T_2)S\exp(T_1 - T_1^*).
\end{aligned}
\]

Since the two exponentials on the right-hand side are unitary, this gives

\[ \|\exp(T_2^*)S\exp(-T_1^*)\| = \|S\|. \]

Applying this identity to the normal operators $\bar z T_1$ and $\bar z T_2$ it follows that

\[ \|\exp(zT_2^*)S\exp(-zT_1^*)\| = \|S\|, \]

so the entire function $f(z) = \exp(zT_2^*)S\exp(-zT_1^*)$ is bounded. By Liouville's theorem
it is constant, so $f(z) = f(0) = S$ for all $z \in \mathbb{C}$, that is,
$S\exp(zT_1^*) = \exp(zT_2^*)S$. Expanding the exponentials as power series in $z$ and comparing
the first-order terms we obtain $ST_1^* = T_2^*S$.
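
The following two-dimensional example is an added illustration, not taken from the text. It indicates that the normality assumption in Theorem 8.18 cannot simply be dropped: on $H = \mathbb{C}^2$ take for $T$ the nilpotent (hence non-normal) matrix below and $S = T$.

```latex
\[
T = S = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
T^* = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
ST = TS = T^2 = 0,
\]
\[
ST^* = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\;\neq\;
\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = T^*S .
\]
```

Here $S$ commutes with $T$ but not with $T^*$, so the conclusion of the theorem fails for this non-normal $T$.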


We finish with an observation about relative spectra. Let $\mathscr{A} \subseteq \mathscr{L}(H)$ be a unital
closed $*$-subalgebra, that is, $\mathscr{A}$ is a unital closed subalgebra of $\mathscr{L}(H)$ closed under taking
adjoints. For such subalgebras we have the following improvement to Proposition 6.19:


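Proposition 8.19. Let $\mathscr{A} \subseteq \mathscr{L}(H)$ be a unital closed $*$-subalgebra and let $T \in \mathscr{A}$. Then

\[ \sigma_{\mathscr{A}}(T) = \sigma(T). \]

Proof By Proposition 6.19 and the observation preceding it, for all $T \in \mathscr{A}$ we have

\[ \partial\sigma_{\mathscr{A}}(T) \subseteq \sigma(T) \subseteq \sigma_{\mathscr{A}}(T). \]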


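First let $S \in \mathscr{A}$ be a selfadjoint operator. We claim that $\sigma_{\mathscr{A}}(S) \subseteq \mathbb{R}$. Indeed, if we
had $\lambda \in \sigma_{\mathscr{A}}(S)$ with $\lambda \notin \mathbb{R}$, then $\sigma_{\mathscr{A}}(S)$ would have boundary points not belonging to
$\mathbb{R}$. But $\partial\sigma_{\mathscr{A}}(S) \subseteq \sigma(S) \subseteq \mathbb{R}$ since $S$ is selfadjoint. This proves the claim. It implies that
$\partial\sigma_{\mathscr{A}}(S) = \sigma_{\mathscr{A}}(S)$, and since also $\partial\sigma_{\mathscr{A}}(S) \subseteq \sigma(S) \subseteq \sigma_{\mathscr{A}}(S)$ we obtain $\sigma_{\mathscr{A}}(S) = \sigma(S)$.

Suppose next that $T \in \mathscr{A}$ is invertible in $\mathscr{L}(H)$. Then also $T^*$ is invertible in $\mathscr{L}(H)$,
and hence so is $T^*T$. Moreover, since $\mathscr{A}$ is closed under taking adjoints and compositions,
the operator $S := T^*T$ belongs to $\mathscr{A}$. By what we just proved, $\sigma_{\mathscr{A}}(S) = \sigma(S)$,
so $T^*T$ is invertible in $\mathscr{A}$. But then $T^{-1} = (T^*T)^{-1}T^*$ belongs to $\mathscr{A}$ as well.
We have shown that if $T \in \mathscr{A}$ and $0 \in \rho(T)$, then $0 \in \rho_{\mathscr{A}}(T)$. Applying this result
to $\lambda - T$ gives the inclusion $\rho(T) \subseteq \rho_{\mathscr{A}}(T)$, that is, $\sigma_{\mathscr{A}}(T) \subseteq \sigma(T)$.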


8.2 The Continuous Functional Calculus 


In Chapter 6 we have seen how to associate a bounded operator $f(T)$ with a bounded
operator $T$ when $f$ is holomorphic in an open neighbourhood of $\sigma(T)$. Here we will
prove that for normal operators $T$ acting on a Hilbert space, the functional calculus
$f \mapsto f(T)$ can be extended to continuous functions on $\sigma(T)$.


8.2.a The Continuous Functional Calculus for Selfadjoint Operators 


We begin with the case of selfadjoint operators. 


Theorem 8.20 (Continuous functional calculus for selfadjoint operators). Let $T \in \mathscr{L}(H)$
be a selfadjoint operator. Then there exists a unique continuous linear mapping
$f \mapsto f(T)$ from $C(\sigma(T))$ to $\mathscr{L}(H)$ with the following properties:

(i) if $f(z) = z^n$ with $n \in \mathbb{N}$, then $f(T) = T^n$;
(ii) for all $f,g \in C(\sigma(T))$ we have $(fg)(T) = f(T)g(T)$;
(iii) for all $f \in C(\sigma(T))$ we have $\overline{f}(T) = (f(T))^*$;
(iv) for all $f \in C(\sigma(T))$ we have $\|f(T)\| = \|f\|_\infty$.


The operators f(T) are normal, and f(T) is selfadjoint if and only if f is real-valued. 


Proof For polynomials $p(z) = \sum_{n=0}^N c_n z^n$ we define $p(T) := \sum_{n=0}^N c_n T^n$. These operators
are normal and satisfy (i), (ii), and (iii). Moreover, by the spectral mapping theorem
for the holomorphic calculus,

\[ \|p(T)\| = \sup\{|\lambda| : \lambda \in \sigma(p(T))\} = \sup\{|\lambda| : \lambda \in p(\sigma(T))\} = \|p\|_\infty \]


and therefore (iv) holds. 
By the Weierstrass approximation theorem, the polynomials are dense in $C(\sigma(T))$.
Therefore, by (iv) and an approximation argument, the mapping $p \mapsto p(T)$ has a unique
extension to an isometry from $C(\sigma(T))$ into $\mathscr{L}(H)$, and (ii)–(iv) again hold.
Since normality is inherited in passing to operator norm limits, the operators $f(T)$
are normal. Property (iii) implies that if $f$ is real-valued, then $f(T)$ is selfadjoint.

Conversely, if $f(T)$ is selfadjoint, then $f(T) = \overline{f}(T)$ by property (iii) and therefore
$\|f - \overline{f}\|_{C(\sigma(T))} = 0$ by property (iv), so $f = \overline{f}$ is real-valued.
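
As an added illustration (not part of the original text), the calculus of Theorem 8.20 can be written out by hand in a two-dimensional case. The sketch below assumes $H = \mathbb{C}^2$ and the selfadjoint matrix $T$ shown; since $\sigma(T) = \{-1,1\}$ is finite, every $f \in C(\sigma(T))$ coincides on $\sigma(T)$ with a polynomial of degree at most one, and $f(T)$ is obtained by evaluating that polynomial at $T$.

```latex
\[
T = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma(T) = \{-1, 1\}, \qquad
p(t) = \frac{f(1)+f(-1)}{2} + \frac{f(1)-f(-1)}{2}\,t ,
\]
\[
f(T) = p(T) = \frac12
\begin{pmatrix} f(1)+f(-1) & f(1)-f(-1) \\ f(1)-f(-1) & f(1)+f(-1) \end{pmatrix}.
\]
% Eigenvectors: f(T)(1,1)^t = f(1)(1,1)^t and f(T)(1,-1)^t = f(-1)(1,-1)^t, so
\[
\|f(T)\| = \max\{|f(1)|, |f(-1)|\} = \|f\|_\infty .
\]
```

For instance, $f(t) = |t|$ gives $f(T) = I$, and $f(T)$ is selfadjoint exactly when $f(1)$ and $f(-1)$ are real, in agreement with the theorem.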
8.2.b The Continuous Functional Calculus for Normal Operators


Every polynomial in the real variables $x$ and $y$ can be written as a polynomial in the
variables $z$ and $\bar z$ by substituting $z = x + iy$, $\bar z = x - iy$. For example, $x^2 + y^2 = z\bar z$. For
polynomials $p(z,\bar z) = \sum_{i,j=0}^k c_{ij} z^i \bar z^j$ and normal operators $T \in \mathscr{L}(H)$, we define

\[ p(T,T^*) := \sum_{i,j=0}^k c_{ij} T^i T^{*j}. \]
The crucial result that enables us to extend the continuous functional calculus to normal 
operators is the following spectral mapping theorem. 


Proposition 8.21. If $T \in \mathscr{L}(H)$ is normal and $p$ is a polynomial in $z$ and $\bar z$, then

\[ \sigma(p(T,T^*)) = \{p(\lambda,\bar\lambda) : \lambda \in \sigma(T)\}. \]


Proof By Proposition 8.14, every $\lambda \in \sigma(T)$ is an approximate eigenvalue of $T$, that
is, there exists a sequence $(x_n)_{n\ge 1}$ of norm one vectors such that $\lim_{n\to\infty} Tx_n - \lambda x_n = 0$.
Then $\lim_{n\to\infty} T^*x_n - \bar\lambda x_n = 0$ by Proposition 8.13. This implies
$\lim_{n\to\infty} p(T,T^*)x_n - p(\lambda,\bar\lambda)x_n = 0$, so $p(\lambda,\bar\lambda)$ is an approximate eigenvalue for $p(T,T^*)$. In particular,
$p(\lambda,\bar\lambda) \in \sigma(p(T,T^*))$. This proves the inclusion '$\supseteq$'.

For the inclusion '$\subseteq$', fix an arbitrary $\mu \in \sigma(p(T,T^*))$. We wish to prove the existence
of a $\lambda \in \sigma(T)$ such that $p(\lambda,\bar\lambda) = \mu$.


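Step 1 – Fixing $\varepsilon > 0$, we claim that there is a nonzero closed subspace $Y$ of $H$,
invariant under both $T$ and $T^*$, such that

\[ \|(p(T,T^*) - \mu I)|_Y\| \le \varepsilon. \tag{8.5} \]

To prove the claim, let $S := p(T,T^*) - \mu I$. This operator is normal and we have $0 \in \sigma(S)$.
Let $R := S^*S$. Arguing as above, we find that $0 \in \sigma(R)$. Consider the continuous
function $f : [0,\infty) \to [0,1]$ given by

\[
f(t) := \begin{cases}
1, & 0 \le t \le \varepsilon/2; \\
2(1 - t/\varepsilon), & \varepsilon/2 \le t \le \varepsilon; \\
0, & t \ge \varepsilon,
\end{cases}
\]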


and let f(R) be the selfadjoint operator obtained from the continuous functional calculus 
for selfadjoint operators (Theorem 8.20). We will show that 


\[ Y := \{x \in H : f(R)x = x\} \]


has the desired properties. 
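Since $T$ commutes with $R$, it commutes with $f(R)$, and therefore $Y$ is invariant under $T$.
By the same reasoning, $Y$ is invariant under $T^*$. Moreover, by the properties of the
continuous functional calculus, for all $y \in Y$ we have

\[ \|Ry\| = \|Rf(R)y\| \le \|t \mapsto tf(t)\|_{C(\sigma(R))}\|y\| \le \|t \mapsto tf(t)\|_{C[0,\infty)}\|y\| \le \varepsilon\|y\|. \]

This implies

\[ \|Sy\|^2 = (Ry|y) \le \|Ry\|\|y\| \le \varepsilon\|y\|^2, \]

that is, $\|Sy\| \le \varepsilon^{1/2}\|y\|$. Since $\varepsilon > 0$ was arbitrary, running the argument with $\varepsilon^2$ in
place of $\varepsilon$ in the definition of $f$ gives (8.5). The claim will be proved once we have checked
that $Y$ is nonzero. If $f(2t) \neq 0$ for some $t \ge 0$, then $2t < \varepsilon$ and therefore $f(t) = 1$. By multiplicativity,

\[ \|(I - f(R))f(2R)\| = \|t \mapsto (1 - f(t))f(2t)\|_{C(\sigma(R))} = 0, \]

where $f(2R) := g(R)$ with $g(t) := f(2t)$. It follows that $\mathsf{R}(f(2R)) \subseteq Y$. But $\mathsf{R}(f(2R))$
is nonzero since

\[ \|f(2R)\| = \|t \mapsto f(2t)\|_{C(\sigma(R))} \ge f(0) = 1. \]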


Step 2 – Given $\varepsilon > 0$, let $Y$ be the closed subspace of Step 1. Since $Y$ is nonzero we
have $\sigma(T|_Y) \neq \emptyset$ by Theorem 6.11. Pick an arbitrary $\lambda \in \sigma(T|_Y)$. Since $Y$ is invariant
under both $T$ and $T^*$, the restricted operator $T|_Y$ is normal as an operator in $\mathscr{L}(Y)$ by
Lemma 8.16, and therefore $\lambda$ is an approximate eigenvalue of $T|_Y$ and hence of $T$. In
particular, we can find a norm one vector $y \in Y$ such that $\|Ty - \lambda y\| \le \varepsilon$.


Step 3 – Up to this point, $\varepsilon > 0$ was fixed. Applying Step 2 to a sequence $\varepsilon_n \downarrow 0$, we
obtain nonzero closed subspaces $Y_n$ of $H$, norm one vectors $y_n \in Y_n$, and points $\lambda_n \in \sigma(T)$
such that $Ty_n - \lambda_n y_n \to 0$ as $n \to \infty$. Passing to a subsequence, we may assume
that $\lambda_n \to \lambda$, and then $Ty_n - \lambda y_n \to 0$ as $n \to \infty$. It follows that $\lambda$ is an approximate
eigenvalue of $T$, with approximate eigensequence $(y_n)_{n\ge 1}$. By the argument of the first
part of the proof,

\[ \lim_{n\to\infty} p(T,T^*)y_n - p(\lambda,\bar\lambda)y_n = 0. \]

On the other hand, by the inequality of Step 1 applied to $\varepsilon_n$, we also have

\[ \|p(T,T^*)y_n - \mu y_n\| \le \varepsilon_n \]

for every $n \ge 1$, and therefore we must have $p(\lambda,\bar\lambda) = \mu$.
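
The role of normality in Proposition 8.21 can be seen in two small matrix examples, added here as an illustration and not taken from the text. Both use $H = \mathbb{C}^2$ and the polynomial $p(z,\bar z) = z\bar z$, so that $p(T,T^*) = TT^*$ in the ordering convention used above.

```latex
% A normal example: the spectral mapping identity holds.
\[
T = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \qquad
TT^* = I, \qquad
\sigma(TT^*) = \{1\} = \{|\lambda|^2 : \lambda \in \sigma(T)\} .
\]
% A non-normal example: the identity fails.
\[
T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
\sigma(T) = \{0\}, \qquad
TT^* = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
\sigma(TT^*) = \{0,1\} \neq \{0\} .
\]
```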


With this proposition at hand we can extend the continuous functional calculus to normal
operators. Repeating the proof of Theorem 8.20 we obtain the following result.


Theorem 8.22 (Continuous functional calculus for normal operators). Let $T \in \mathscr{L}(H)$
be a normal operator. Then there exists a unique continuous linear mapping $f \mapsto f(T)$
from $C(\sigma(T))$ to $\mathscr{L}(H)$ with the following properties:

(i) if $f(z) = z^m\bar z^n$ with $m,n \in \mathbb{N}$, then $f(T) = T^m T^{*n}$;
(ii) for all $f,g \in C(\sigma(T))$ we have $(fg)(T) = f(T)g(T)$;
(iii) for all $f \in C(\sigma(T))$ we have $\overline{f}(T) = (f(T))^*$;
(iv) for all $f \in C(\sigma(T))$ we have $\|f(T)\| = \|f\|_\infty$.

The operators $f(T)$ are normal, and $f(T)$ is selfadjoint if and only if $f$ is real-valued.

The next theorem extends Proposition 8.21 to continuous functions defined on $\sigma(T)$.


Theorem 8.23 (Spectral mapping theorem). If $T \in \mathscr{L}(H)$ is normal, then for all
$f \in C(\sigma(T))$ we have

\[ \sigma(f(T)) = f(\sigma(T)). \]

Proof The proof of the inclusion $\sigma(f(T)) \subseteq f(\sigma(T))$ follows the lines of Theorem
6.21. Indeed, if $\lambda \notin f(\sigma(T))$, the function $g_\lambda := 1/f_\lambda$, with $f_\lambda(z) := \lambda - f(z)$, is continuous
on $\sigma(T)$, and by multiplicativity we obtain

\[ g_\lambda(T)(\lambda - f(T)) = (\lambda - f(T))g_\lambda(T) = (f_\lambda g_\lambda)(T) = \mathbf{1}(T) = I, \]

so $\lambda \in \rho(f(T))$ and $R(\lambda, f(T)) = g_\lambda(T)$. This gives the stated inclusion.
For the converse inclusion, let $\lambda \in \sigma(T)$ be arbitrary and fixed and let $\mu = f(\lambda)$.
Using the Stone–Weierstrass theorem, choose polynomials $p_n$ such that

\[ \lim_{n\to\infty} \sup_{z\in\sigma(T)} |p_n(z,\bar z) - f(z)| = 0. \]

We may assume that $p_n(\lambda,\bar\lambda) = \mu$. Then, by Proposition 8.21, $\mu \in \sigma(p_n(T,T^*))$. Also,
by property (iv) of the functional calculus, $\lim_{n\to\infty} \|p_n(T,T^*) - f(T)\| = 0$. By lower
semicontinuity (Proposition 6.15), this implies $\mu \in \sigma(f(T))$.


Corollary 8.24. Let $T \in \mathscr{L}(H)$ be a normal operator and let $f \in C(\sigma(T))$. Then $f(T)$
is positive if and only if $f$ is nonnegative.


Proof If $f$ is nonnegative, the spectral mapping theorem gives $\sigma(f(T)) = f(\sigma(T)) \subseteq [0,\infty)$,
and therefore the selfadjoint operator $f(T)$ is positive by Theorem 8.11. Conversely,
if $f(T)$ is positive, then $\sigma(f(T)) \subseteq [0,\infty)$ by Theorem 8.11, and therefore
$f(\sigma(T)) \subseteq [0,\infty)$ by the spectral mapping theorem.


The next theorem extends Proposition 6.22 to continuous functions defined on $\sigma(T)$.


Theorem 8.25 (Composition). Let $T \in \mathscr{L}(H)$ be normal. For all $f \in C(\sigma(T))$ and
$g \in C(\sigma(f(T)))$ we have $g \circ f \in C(\sigma(T))$ and

\[ g(f(T)) = (g \circ f)(T). \]


Proof The spectral mapping theorem implies that $\sigma(f(T)) = f(\sigma(T))$, and therefore
$g \circ f$ is well defined as an element of $C(\sigma(T))$.
First let $p(z) = z^m\bar z^n$ with $m,n \in \mathbb{N}$. Then, by the properties of the continuous calculus,

\[ p(f(T)) = (f(T))^m (f(T))^{*n} = (f^m \overline{f}^n)(T) = (p \circ f)(T). \]

By linearity, this identity extends to polynomials $p$. If $g \in C(\sigma(f(T)))$ is an arbitrary
continuous function, the identity follows by approximating $g$ uniformly by polynomials
$p_n$ via the Stone–Weierstrass theorem to obtain

\[ \|p_n(f(T)) - g(f(T))\| = \|p_n - g\|_{C(\sigma(f(T)))} \to 0 \quad \text{as } n \to \infty \]

and

\[ \|(p_n \circ f)(T) - (g \circ f)(T)\| = \|p_n \circ f - g \circ f\|_{C(\sigma(T))} \to 0 \quad \text{as } n \to \infty. \]


We finally check consistency with the holomorphic calculus. 


Theorem 8.26. Let $T \in \mathscr{L}(H)$ be normal. If $f \in H(\Omega)$, where $\Omega$ is an open set containing
$\sigma(T)$, then $f(T)$ agrees with the operator defined through the holomorphic calculus.


Figure 8.1 Proof of Theorem 8.26: $\sigma(T) = K_1 \cup K_2$, $\Gamma = \Gamma_1 \cup \Gamma_2$


Proof The set $\Omega$ is the union of at most countably many disjoint connected open sets,
and by compactness the set $\sigma(T)$ is contained in finitely many of them. Hence there is
no loss of generality in assuming that $\sigma(T) = \bigcup_{j=1}^k K_j \subseteq \bigcup_{j=1}^k \Omega_j = \Omega$, with the sets
$K_j$ compact and contained in the connected open sets $\Omega_j$, which are disjoint. Choose
bounded open sets $U_j$ such that $K_j \subseteq U_j \subseteq \overline{U_j} \subseteq \Omega_j$, $j = 1,\dots,k$, in such a way that
$\mathbb{C} \setminus \bigcup_{j=1}^k \overline{U_j}$ is the union of at most finitely many disjoint connected open sets. Finally,
let $\Gamma = \bigcup_{j=1}^k \Gamma_j$ be an admissible contour for $\sigma(T)$ in the sense described in Section 6.2,
with $\Gamma_j$ contained in $U_j$.

By Runge's theorem there exists a sequence of rational functions $r_n$ such that $r_n \to f$
uniformly in $U$, where $U = \bigcup_{j=1}^k U_j$. The operators $r_n(T)$ agree in the continuous calculus
and the holomorphic calculus because this equality holds in the case of polynomials
and for resolvents, and equality for rational functions follows from this by the
multiplicativity of both calculi. Denoting by $f^{(c)}(T)$ and $f^{(h)}(T)$ the operators defined by the
continuous calculus and the holomorphic calculus, respectively, it follows that

\[ f^{(c)}(T) = \lim_{n\to\infty} r_n(T) = f^{(h)}(T) \]

with convergence in operator norm; the first equality is a consequence of property (iv)
and the second follows from the estimate

\[
\|r_n(T) - f^{(h)}(T)\| \le \frac{1}{2\pi}\int_\Gamma |r_n(z) - f(z)|\,\|R(z,T)\|\,|dz|
\le \frac{|\Gamma|}{2\pi}\,\|r_n - f\|_\infty \sup_{z\in\Gamma}\|R(z,T)\|,
\]

with $|\Gamma|$ the length of $\Gamma$.


8.2.c Applications of the continuous functional calculus 
We now turn to some applications of the continuous functional calculus. 


Proposition 8.27 (Square roots). If $T \in \mathscr{L}(H)$ is positive, there exists a unique positive
operator $S \in \mathscr{L}(H)$ such that $S^2 = T$.


Henceforth, this operator $S$ will be denoted by $T^{1/2}$.


Proof Since $T$ is positive, $T$ is selfadjoint and its spectrum is contained in $[0,\infty)$ by
Theorem 8.11. Hence, $f(t) = \sqrt{t}$ is a well defined continuous function on $\sigma(T)$. The
operator $S := f(T)$ is positive by Theorem 8.20, and it satisfies $S^2 = T$ by the properties
of the continuous functional calculus. It remains to prove uniqueness. Suppose $\widetilde S$ is
another positive operator with the property that $\widetilde S^2 = T$. With $f(t) = \sqrt{t}$, $g(t) = t^2$, and
$h(t) = t$ we have, by the properties of the continuous functional calculus and Theorem
8.25,

\[ f(\widetilde S^2) = f(g(\widetilde S)) = (f \circ g)(\widetilde S) = h(\widetilde S) = \widetilde S. \]

It follows that $\widetilde S = f(\widetilde S^2) = f(T) = f(S^2) = S$. This completes the uniqueness proof.
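
A concrete instance of Proposition 8.27, added here as an illustration and not part of the original text: for a positive $2\times 2$ matrix the square root is obtained by applying $t \mapsto \sqrt t$ to the eigenvalues. The matrix $T$ below is an assumed example with eigenvalues 1 and 3.

```latex
\[
T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}
  = 1\cdot P_1 + 3\cdot P_3, \qquad
P_1 = \frac12\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \quad
P_3 = \frac12\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},
\]
\[
T^{1/2} = P_1 + \sqrt3\,P_3
 = \frac12\begin{pmatrix} 1+\sqrt3 & \sqrt3-1 \\ \sqrt3-1 & 1+\sqrt3 \end{pmatrix}.
\]
% Check: with a = (1+\sqrt3)/2 and b = (\sqrt3-1)/2 one has
% a^2 + b^2 = 2 and 2ab = 1, so (T^{1/2})^2 = T.
```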


Definition 8.28 (Modulus of an operator). The modulus of an operator $T \in \mathscr{L}(H)$ is
the positive operator $|T| := (T^*T)^{1/2}$.


Corollary 8.29. If $T \in \mathscr{L}(H)$ is a normal operator, then $|T| = f(T)$, where $f(z) := |z|$.


Proof Let $g(z) := \bar z z$. Then $f^2 = g$ and therefore $|T|^2 = T^*T = g(T) = f^2(T) = (f(T))^2$
by the multiplicativity of the continuous functional calculus. Since by Corollary 8.24
the operator $f(T)$ is positive, the result follows by taking square roots.


We continue with a polar decomposition result. In view of future applications we
phrase it for bounded operators $T \in \mathscr{L}(H,K)$, where $H$ and $K$ are Hilbert spaces. The
modulus of such an operator is the positive operator $|T| := (T^*T)^{1/2}$ on $H$. An operator
$U \in \mathscr{L}(H,K)$ will be called unitary if it is isometric and surjective. A partial isometry
is a bounded operator $V \in \mathscr{L}(H,K)$ for which there exists an orthogonal direct sum
decomposition $H = H_0 \oplus H_0^\perp$ such that $V$ is isometric from $H_0$ into $K$ and zero on $H_0^\perp$.


Theorem 8.30 (Polar decomposition). Let $T \in \mathscr{L}(H,K)$. The following assertions
hold:

(1) $T$ admits a representation $T = U|T|$, with $U$ a partial isometry from $H$ to $K$ which
is isometric from $\overline{\mathsf{R}(|T|)}$ onto $\overline{\mathsf{R}(T)}$;

(2) if $T$ is invertible, then $T$ admits a unique representation $T = U|T|$ with $U$ unitary
from $H$ onto $K$.


Proof (1): From

\[ \|Tx\|^2 = (T^*Tx|x) = (|T|x \,|\, |T|x) = \||T|x\|^2 \]

it follows that the mapping $U_0 : |T|x \mapsto Tx$, viewed as a linear operator from $\mathsf{R}(|T|)$
onto $\mathsf{R}(T)$, is well defined and isometric, and by density it extends to an isometry from
$\overline{\mathsf{R}(|T|)}$ onto $\overline{\mathsf{R}(T)}$. Moreover, $T = U_0|T|$. Along the orthogonal decomposition
$H = \overline{\mathsf{R}(|T|)} \oplus (\mathsf{R}(|T|))^\perp$ we now extend $U_0$ identically zero on $(\mathsf{R}(|T|))^\perp$.

(2): The operator $T^*T$ is positive and invertible. Hence $|T| = (T^*T)^{1/2}$ is invertible
as well, by the spectral mapping theorem. Set $U := T|T|^{-1}$. From

\[ U^*U = |T|^{-1}T^*T|T|^{-1} = |T|^{-1}|T|^2|T|^{-1} = I \]

and the fact that $U$ is invertible it follows that $U^* = U^{-1}$ and $U$ is unitary. To prove
that $U$ is unique suppose that $T = U|T| = \widetilde U|T|$ with both $U$ and $\widetilde U$ unitary. Then
$|T| = U^*U|T| = U^*\widetilde U|T|$, and since $|T|$ is invertible this implies $U^*\widetilde U = I$. Multiplying both
sides with $U$ gives $\widetilde U = U$.
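
Both parts of Theorem 8.30 can be checked by hand on $2 \times 2$ matrices. The two examples below are an added illustration with assumed matrices, not taken from the text; the first is invertible, the second is not.

```latex
% (2) Invertible case: U is unitary (a rotation).
\[
T = \begin{pmatrix} 0 & -2 \\ 1 & 0 \end{pmatrix}, \quad
T^*T = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}, \quad
|T| = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \quad
U = T|T|^{-1} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
\]
% (1) Non-invertible case: U is only a partial isometry.
\[
T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
|T| = (T^*T)^{1/2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad
U = T, \quad
U|T| = T .
\]
```

In the second example $U$ maps $\overline{\mathsf{R}(|T|)} = \mathrm{span}\{e_2\}$ isometrically onto $\overline{\mathsf{R}(T)} = \mathrm{span}\{e_1\}$ and vanishes on $\mathrm{span}\{e_1\}$.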


8.3 The Sz.-Nagy Dilation Theorem 


The last section of this chapter is devoted to a proof of the celebrated Sz.-Nagy dilation 
theorem, which asserts that every Hilbert space contraction has a unitary dilation. Since 
this poses no additional difficulties, we take a rather general approach starting from an 
arbitrary group G with unit element e. 


Definition 8.31 (Positive definiteness). A mapping $T : G \to \mathscr{L}(H)$ is called positive
definite if for all finite choices of $g_1,\dots,g_N \in G$ and $h_1,\dots,h_N \in H$ we have

\[ \sum_{m,n=1}^N (T(g_m^{-1}g_n)h_n|h_m) \ge 0. \]

Definition 8.32 (Representations, unitary representations). A mapping $U : G \to \mathscr{L}(H)$
is called a representation of $G$ on $H$ if $U(e) = I$ and $U(g_1)U(g_2) = U(g_1g_2)$ for all
$g_1,g_2 \in G$. A unitary representation is a representation whose constituting operators
are unitaries.
The following result connects these two notions.


Proposition 8.33. Let $U : G \to \mathscr{L}(\widetilde H)$ be a unitary representation of $G$ on a Hilbert
space $\widetilde H$. Let $J : H \to \widetilde H$ be an isometric embedding of another Hilbert space $H$ into $\widetilde H$.
Then the function $T : G \to \mathscr{L}(H)$ given by

\[ T(g) := J^*U(g)J, \]

is positive definite and satisfies $T(e) = I$ and $T^*(g) = T(g^{-1})$ for all $g \in G$.


Proof The identity $T(e) = I$ follows from $U(e) = I$ and $J^*J = I$, and the identity
$U(g^{-1}) = (U(g))^*$ implies $T^*(g) = J^*U^*(g)J = J^*(U(g))^{-1}J = J^*U(g^{-1})J = T(g^{-1})$.
To prove positive definiteness, let $g_1,\dots,g_N \in G$ and $h_1,\dots,h_N \in H$. Then

\[
\sum_{m,n=1}^N (T(g_m^{-1}g_n)h_n|h_m)
= \sum_{m,n=1}^N (U(g_m)^{-1}U(g_n)Jh_n|Jh_m)
= \Big\|\sum_{n=1}^N U(g_n)Jh_n\Big\|^2 \ge 0.
\]


The next theorem establishes that, conversely, every positive definite function
$T : G \to \mathscr{L}(H)$ satisfying $T(e) = I$ and $T^*(g) = T(g^{-1})$ for all $g \in G$ arises in this way.


Theorem 8.34 (Unitary dilations). Let $T : G \to \mathscr{L}(H)$ be a positive definite function
satisfying $T(e) = I$ and $T^*(g) = T(g^{-1})$ for all $g \in G$. There exist a Hilbert space $\widetilde H$,
an isometric embedding $J : H \to \widetilde H$, and a unitary representation $U : G \to \mathscr{L}(\widetilde H)$ such
that

\[ T(g)h = J^*U(g)Jh, \quad h \in H. \]


Proof Let $V$ be the vector space of all functions $f : G \to H$ that vanish outside a finite
set. We claim that

\[ (f_1|f_2) := \sum_{g,g'\in G} (T(g^{-1}g')f_1(g')|f_2(g)) \]

defines a sesquilinear mapping from $V \times V$ to $\mathbb{C}$ satisfying $(f_1|f_2) = \overline{(f_2|f_1)}$ for all
$f_1, f_2 \in V$ and $(f|f) \ge 0$ for all $f \in V$. Sesquilinearity is clear. Writing
$f_m = \sum_{i=1}^k \mathbf{1}_{\{g_i\}} \otimes h_i^{(m)}$, $m = 1,2$ (allowing the possibility that some of the $h_i^{(m)}$
are zero), we have

\[
(f_1|f_2) = \sum_{g,g'\in G}\sum_{i,j=1}^k \mathbf{1}_{\{g_i\}}(g')\mathbf{1}_{\{g_j\}}(g)\,(T(g^{-1}g')h_i^{(1)}|h_j^{(2)})
= \sum_{i,j=1}^k (T(g_j^{-1}g_i)h_i^{(1)}|h_j^{(2)}). \tag{8.6}
\]

A similar computation gives

\[ \overline{(f_2|f_1)} = \sum_{i,j=1}^k (h_i^{(1)}|T(g_i^{-1}g_j)h_j^{(2)}). \tag{8.7} \]

Since $T^*(g) = T(g^{-1})$ for all $g \in G$, the right-hand sides of (8.6) and (8.7) are equal,
thus proving the identity $(f_1|f_2) = \overline{(f_2|f_1)}$. Positive definiteness implies $(f|f) \ge 0$.
The properties established in the claim suffice for the validity of the Cauchy–Schwarz
inequality. It may happen, however, that $(f|f) = 0$ for certain nonzero functions $f$ in $V$,
so this sesquilinear form may fail to be an inner product. For this reason we consider
the vector space quotient $V/N$, where

\[ N = \{f \in V : (f|f) = 0\}. \]

Let us prove that $N$ is indeed a subspace of $V$. It is clear that $cf \in N$ for all $c \in \mathbb{C}$ and
$f \in N$. Furthermore, if $f, f' \in N$, then $(f|f') = 0$ by the Cauchy–Schwarz inequality,
and from this it follows that $f + f' \in N$.

On the quotient space $V/N$, the sesquilinear mapping $(\cdot|\cdot)$ induces the inner product

\[ (f + N|g + N) := (f|g). \]

Define $\widetilde H$ to be the Hilbert space completion of $V/N$ with respect to this inner product.
To realise $H$ as a closed subspace of $\widetilde H$ we identify elements $h \in H$ with the class modulo
$N$ of the functions $f_h : G \to H$ given by $f_h := \mathbf{1}_{\{e\}} \otimes h$. Then

\[ (f_{h_1}|f_{h_2}) = \sum_{g,g'\in G} (T(g^{-1}g')f_{h_1}(g')|f_{h_2}(g)) = (T(e)h_1|h_2) = (h_1|h_2) \]

since $T(e) = I$. This implies that the mapping $J : h \mapsto f_h + N$ is isometric from $H$ into $\widetilde H$.
The linear mapping $U(g) : V \to V$ given by

\[ (U(g)f)(g') := f(g^{-1}g'), \quad f \in V, \ g,g' \in G, \]

is well defined and preserves inner products; in particular it maps $N$ into itself. Indeed,
if $f \in N$, then by a change of variables we have

\[
\begin{aligned}
(U(g)f|U(g)f) &= \sum_{g',g''\in G} (T(g'^{-1}g'')f(g^{-1}g'')|f(g^{-1}g')) \\
&= \sum_{g',g''\in G} (T((g^{-1}g')^{-1}g^{-1}g'')f(g^{-1}g'')|f(g^{-1}g')) \\
&= \sum_{g',g''\in G} (T(g'^{-1}g'')f(g'')|f(g')) = (f|f) = 0.
\end{aligned}
\]

Moreover,

\[ U(g_1)U(g_2)f = f(g_2^{-1}g_1^{-1}\,\cdot\,) = f((g_1g_2)^{-1}\,\cdot\,) = U(g_1g_2)f. \tag{8.8} \]
Upon passing to the quotient, we obtain a well defined linear mapping, again denoted by $U(g)$,
on $V/N$ which preserves inner products. Therefore $U(g)$ extends to an isometry from $\widetilde H$
into itself, which we once again denote by $U(g)$, and by passing to the quotient in (8.8)
we see that $U(g_1)U(g_2) = U(g_1g_2)$, that is, the resulting mapping $U : G \to \mathscr{L}(\widetilde H)$ is a
homomorphism. Since each operator $U(g)$ preserves inner products, in order to prove
that $U(g)$ is unitary it suffices to prove that it is surjective, and for this it suffices to
prove that each $U(g)$ is surjective as a mapping on $V$. However, the latter is immediate from the definition,
which implies that every finitely supported $H$-valued function on $G$ is in the range of
$U(g)$. It follows that $U$ is a unitary representation of $G$ on $\widetilde H$.

This representation has the desired properties: for all $g, g' \in G$ and $h, h' \in H$ we have

\[ U(g)f_h(g') = (U(g)(\mathbf{1}_{\{e\}} \otimes h))(g') = \mathbf{1}_{\{e\}}(g^{-1}g')h = \mathbf{1}_{\{g\}}(g')h \]

and consequently

\[
\begin{aligned}
(J^*U(g)Jh|h') &= (U(g)f_h|f_{h'}) = (\mathbf{1}_{\{g\}} \otimes h|\mathbf{1}_{\{e\}} \otimes h') \\
&= \sum_{g',g''\in G} \mathbf{1}_{\{g\}}(g'')\mathbf{1}_{\{e\}}(g')\,(T((g')^{-1}g'')h|h') = (T(g)h|h').
\end{aligned}
\]


The theorem will be applied in the following situation:

Lemma 8.35. If $T \in \mathscr{L}(H)$ is a contraction, the mapping $S : \mathbb{Z} \to \mathscr{L}(H)$ defined by

\[
S(n) := \begin{cases}
T^n, & n \ge 1, \\
I, & n = 0, \\
(T^*)^{-n}, & n \le -1,
\end{cases}
\]

is positive definite and satisfies $S^*(n) = S(-n)$ for all $n \in \mathbb{Z}$.


Proof Since $T$ is a contraction, from $((I - T^*T)x|x) = \|x\|^2 - \|Tx\|^2 \ge 0$ it follows that
$I - T^*T$ is a positive operator. As a consequence, by Proposition 8.27 the defect operator

\[ D_T := (I - T^*T)^{1/2} \]

is well defined and positive, and we have

\[ \|D_Tx\|^2 = ((I - T^*T)^{1/2}x|(I - T^*T)^{1/2}x) = ((I - T^*T)x|x) = \|x\|^2 - \|Tx\|^2. \]

Define

\[ \ell^2(H) := \Big\{h = (h_n)_{n\ge 1} \subseteq H : \sum_{n\ge 1} \|h_n\|^2 < \infty\Big\}. \]

With respect to the inner product $(g|h) := \sum_{n\ge 1} (g_n|h_n)$, $\ell^2(H)$ is a Hilbert space;
completeness is proved in the same way as for $\ell^2$. We define the operator $\widetilde T$ on $\ell^2(H)$ by

\[ \widetilde T : h \mapsto (Th_1, D_Th_1, h_2, h_3, \dots). \]
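Clearly

\[ \|\widetilde T h\|^2_{\ell^2(H)} = \|Th_1\|^2 + \|D_Th_1\|^2 + \sum_{n\ge 2} \|h_n\|^2 = \|h\|^2, \]

so $\widetilde T$ is isometric. Define $\widetilde S : \mathbb{Z} \to \mathscr{L}(\ell^2(H))$ by

\[
\widetilde S(n) := \begin{cases}
\widetilde T^n, & n \ge 1; \\
I, & n = 0; \\
(\widetilde T^*)^{-n}, & n \le -1,
\end{cases}
\]

where $\widetilde T^*$ is the Hilbert space adjoint of $\widetilde T$. We make the trivial but crucial observation
that

\[ (S(n-m)h|h') = (\widetilde S(n-m)Jh|Jh'), \quad h,h' \in H, \ m,n \ge 1, \]

where $J : H \to \ell^2(H)$ is defined by $Jh := (h,0,0,\dots)$. It follows that for all choices of
$h_1,\dots,h_N \in H$ we have (with the convention $\widetilde T^0 = I$)

\[
\begin{aligned}
\sum_{m,n=1}^N (S(n-m)h_n|h_m) &= \sum_{m,n=1}^N (\widetilde S(n-m)Jh_n|Jh_m) \\
&= \sum_{1\le m\le n\le N} (\widetilde T^{\,n-m}Jh_n|Jh_m) + \sum_{1\le n<m\le N} ((\widetilde T^*)^{m-n}Jh_n|Jh_m) \\
&= \sum_{1\le m\le n\le N} (\widetilde T^{\,n-m}Jh_n|Jh_m) + \sum_{1\le n<m\le N} (Jh_n|\widetilde T^{\,m-n}Jh_m) \\
&\overset{(*)}{=} \sum_{1\le m\le n\le N} (\widetilde T^{\,n}Jh_n|\widetilde T^{\,m}Jh_m) + \sum_{1\le n<m\le N} (\widetilde T^{\,n}Jh_n|\widetilde T^{\,m}Jh_m) \\
&= \Big\|\sum_{k=1}^N \widetilde T^{\,k}Jh_k\Big\|^2 \ge 0,
\end{aligned}
\]

where $(*)$ uses that $\widetilde T$ is an isometry and consequently $(\widetilde Tg|\widetilde Th) = (g|h)$. This proves
that $S$ is positive definite.
The identity $S^*(n) = S(-n)$ for $n \in \mathbb{Z}$ is clear from the definition.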
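
The construction in this proof can be followed in the simplest case. The computation below is an added sketch, not part of the original text: it assumes $H = \mathbb{C}$ and $T = c$ with $|c| \le 1$, writes $d := (1 - |c|^2)^{1/2}$ for the defect operator $D_T$, and writes $\widetilde T$ for the isometry $h \mapsto (Th_1, D_Th_1, h_2, h_3, \dots)$ on $\ell^2(H)$ and $J$ for the embedding $h \mapsto (h,0,0,\dots)$.

```latex
\[
\widetilde T Jh = (ch,\, dh,\, 0,\, 0, \dots), \qquad
\widetilde T^{2} Jh = (c^2 h,\, dch,\, dh,\, 0, \dots),
\]
\[
\|\widetilde T^{2} Jh\|^2 = \big(|c|^4 + d^2|c|^2 + d^2\big)|h|^2
 = \big(|c|^2(|c|^2 + d^2) + d^2\big)|h|^2 = |h|^2,
\]
\[
J^*\widetilde T Jh = ch = Th, \qquad J^*\widetilde T^{2} Jh = c^2 h = T^2 h .
\]
```

The first coordinate of $\widetilde T^{\,n}Jh$ is always $T^nh$, which is the observation $(S(n-m)h|h') = (\widetilde S(n-m)Jh|Jh')$ used in the proof, while the remaining coordinates carry exactly the norm that $T^n$ loses.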


Combining the lemma with the theorem, we arrive at the following result. 


Theorem 8.36 (Sz.-Nagy dilation theorem). If $T \in \mathscr{L}(H)$ is a contraction, then there
exist a Hilbert space $\widetilde H$, an isometric embedding $J : H \to \widetilde H$, and a unitary operator
$U \in \mathscr{L}(\widetilde H)$ such that

\[ T^n = J^*U^nJ, \quad n \in \mathbb{N}. \]

In this context the operator $U$ is said to be a unitary dilation of $T$. As a simple example,
the left (right) shift on $\ell^2(\mathbb{Z})$ is a unitary dilation of the left (right) shift on $\ell^2$.


Problems

8.1 Prove the Hellinger–Toeplitz theorem: If $T : H \to H$ is a linear mapping satisfying

\[ (Tx|y) = (x|Ty), \quad x,y \in H, \]

then $T$ is bounded.
Hint: Apply the uniform boundedness theorem to the operators $T_x : H \to \mathbb{K}$ given
by $T_xy := (Tx|y)$.

8.2 Give a direct proof of Proposition 8.10 based on a Neumann series argument for
$U$ and $U^{-1}$.

8.3 Let $T \in \mathscr{L}(H)$ be selfadjoint. The aim of this problem is to deduce the inclusion
$\sigma(T) \subseteq \mathbb{R}$ from Proposition 8.10 (or the preceding problem). Fix $\lambda \in \sigma(T)$.

(a) Show that

\[ e^{iT} - e^{i\lambda} = e^{i\lambda}(T - \lambda)\Big(\sum_{n=1}^\infty \frac{i^n}{n!}(T - \lambda)^{n-1}\Big) \]

and conclude that $e^{iT} - e^{i\lambda}$ fails to be invertible.
(b) Combine this with Proposition 8.10 to conclude that $\lambda \in \mathbb{R}$.

8.4 Show that a projection $P \in \mathscr{L}(H)$ is an orthogonal projection if and only if

\[ \|Ph\| \le \|h\|, \quad h \in H. \]

Hint: The latter condition implies that $\|P(g + ch)\|^2 \le \|g + ch\|^2$ for all $c \in \mathbb{K}$ and
$g,h \in H$. Now consider $g \in \mathsf{R}(P)$ and $h \in \mathsf{N}(P)$, and vary $c$.

8.5 Let $H_0$ be a closed subspace of $H$ and let $T \in \mathscr{L}(H)$. Prove that if both $T$ and $T^*$
leave $H_0$ invariant, then $\sigma(T|_{H_0})$ is a subset of $\sigma(T)$.

8.6 Show that the norm of an operator $T \in \mathscr{L}(H)$ is given by

\[ \|T\|^2 = \inf\{\lambda > 0 : T^*T \le \lambda I\}, \]

where $T^*T \le \lambda I$ means that $\lambda I - T^*T$ is positive.

8.7 Let $T \in \mathscr{L}(H)$ be selfadjoint and let $\lambda \in \mathbb{R}$.

(a) Show that if $\sigma(T) = \{\lambda\}$, then $T = \lambda I$.
(b) Show that if $\sigma(PTP) = \{\lambda\}$, where $P$ is an orthogonal projection with $\mathsf{N}(P)$
finite-dimensional, then $\lambda - T$ is compact; if, in addition, $P \neq I$, then $\lambda = 0$
and $T$ is compact.

8.8 The numerical range of an operator $T \in \mathscr{L}(H)$ is the set

\[ W(T) := \{(Tx|x) : \|x\| = 1\}. \]

The numerical radius of $T$ is defined by

\[ w(T) := \sup\{|\lambda| : \lambda \in W(T)\}. \]

Prove the following assertions:

(a) $\tfrac12\|T\| \le w(T) \le \|T\|$.
Hint: To prove the first inequality use the identity

\[
4(Tx|y) = (T(x+y)|x+y) - (T(x-y)|x-y) + i(T(x+iy)|x+iy) - i(T(x-iy)|x-iy).
\]

(b) $T$ is selfadjoint if and only if $W(T) \subseteq \mathbb{R}$.
Hint: Consider the operator $i(T - T^*)$.
(c) If $W(T) = \{\lambda\}$ for some $\lambda \in \mathbb{C}$, then $T = \lambda I$.
8.9 In this problem we prove the Toeplitz–Hausdorff theorem which asserts that the
numerical range of any operator $T \in \mathscr{L}(H)$ is a convex subset of $\mathbb{C}$.

(a) Show that for all $\lambda, \mu \in \mathbb{C}$ we have $W(\lambda T + \mu I) = \lambda W(T) + \mu$. Conclude
that in order to prove the Toeplitz–Hausdorff theorem it suffices to establish
that $\{0,1\} \subseteq W(T)$ implies $[0,1] \subseteq W(T)$.

In what follows we fix an operator $T \in \mathscr{L}(H)$ such that $\{0,1\} \subseteq W(T)$. We prove
that $[0,1] \subseteq W(T)$.
Choose norm one vectors $x,y \in H$ such that $(Tx|x) = 0$ and $(Ty|y) = 1$.

(b) Define $g : [0,2\pi] \to \mathbb{C}$ by

\[ g(t) := e^{-it}(Tx|y) + e^{it}(Ty|x). \]

Using that $g(t + \pi) = -g(t)$ for $t \in [0,\pi]$, show that either there exists $t_0 \in [0,\pi]$
such that $g(t_0) = 0$ or else there exists $t_0 \in [0,2\pi]$ such that $g(t_0) > 0$.

Set $\widetilde y := e^{it_0}y$.

(c) Show that $x$ and $\widetilde y$ are linearly independent.

Define $z : [0,1] \to H$ and $f : [0,1] \to \mathbb{C}$ by

\[ z(t) := \frac{(1-t)x + t\widetilde y}{\|(1-t)x + t\widetilde y\|}, \qquad f(t) := (Tz(t)|z(t)). \]

These functions are well defined by part (c).

(d) Show that $f$ is continuous, real-valued, and satisfies $f(0) = 0$ and $f(1) = 1$.
Deduce that $[0,1] \subseteq W(T)$.


8.10 Prove that for all $T \in \mathscr{L}(H)$ we have

\[ \sigma(T) \subseteq \overline{W(T)}. \]

Hint: First prove that approximate eigenvalues belong to $\overline{W(T)}$. Then apply the
Toeplitz–Hausdorff theorem in combination with Proposition 6.17.


8.11 Show that if $S,T \in \mathscr{L}(H)$ are positive operators, then $\sigma(ST) \subseteq [0,\infty)$.
Hint: Apply the result of Problem 6.12 to the operators $S^{1/2}$ and $S^{1/2}T$.

8.12 Consider the Volterra operator $T$ on $L^2(0,1)$ of Example 1.31:

\[ (Tf)(t) = \int_0^t f(s)\,ds, \quad f \in L^2(0,1), \ t \in (0,1). \]

We sketch two proofs that $\sigma(T) = \{0\}$.

(a) Show that for all $0 \neq \lambda \in \mathbb{C}$ and $g \in L^2(0,1)$ the equation $(\lambda - T)f = g$ has
a unique solution in $L^2(0,1)$. Deduce that $\sigma(T) = \{0\}$.

A second proof is obtained by estimating the norm of $T^n$:

(b) Show that $\|T^n\| \le \frac{1}{n!}$ for all $n = 1,2,\dots$
Hint: First show that

\[
\begin{aligned}
T^nf(t) &= \int_0^t \int_0^{t_{n-1}} \cdots \int_0^{t_1} f(s)\,ds\,dt_1 \cdots dt_{n-1} \\
&= \int_0^t f(s) \int_s^t \int_s^{t_{n-1}} \cdots \int_s^{t_2} dt_1 \cdots dt_{n-2}\,dt_{n-1}\,ds.
\end{aligned}
\]

(c) Using Theorem 6.23, conclude that $\sigma(T) = \{0\}$.
(d) Show that 0 is not an eigenvalue of $T$.
(e) Deduce from (a) or (c) that $T$ is not normal.

8.13 Let $T_m$ be a Fourier multiplier operator on $L^2(\mathbb{R}^d)$ with symbol $m \in L^\infty(\mathbb{R}^d)$.

(a) Show that $T_m$ is normal.
(b) Show that if $f$ is a continuous function on the essential range of $m$ (see Problem 6.5),
then $f(T_m)$ is well defined through the continuous functional calculus
and equal to the Fourier multiplier $T_{f\circ m}$ with symbol $f \circ m \in L^\infty(\mathbb{R}^d)$.
(c) Compare this result with Problem 6.6.

8.14 Show that the rotation operator on $L^2(\mathbb{T})$ defined by

\[ R_\theta f(e^{it}) := f(e^{i(t+\theta)}) \]

is unitary and find its spectrum.
Hint: Distinguish the cases $\theta/2\pi \in \mathbb{Q}$ and $\theta/2\pi \notin \mathbb{Q}$.

8.15 Prove that if $T \in \mathscr{L}(H)$ is an isometry, then there exist Hilbert spaces $G$ and $K$
such that we have an isometric isomorphism of Hilbert spaces

\[ H \simeq \ell^2(G) \oplus K, \]

where $\ell^2(G)$ is the Hilbert space of all square summable sequences $g = (g_n)_{n\ge 1}$
in $G$ with norm $\|g\|^2_{\ell^2(G)} := \sum_{n\ge 1} \|g_n\|^2$, and that along this decomposition we have

\[ T \simeq S \oplus U, \]

where $S$ is the right shift on $\ell^2(G)$, that is, $S$ maps the sequence $(g_1,g_2,\dots)$ to
$(0,g_1,g_2,\dots)$, and $U$ is a unitary operator on $K$. This decomposition is known as the
Wold decomposition.
Hint: For $n \in \mathbb{N}$ let $H_n := \mathsf{R}(T^n)$, and for $n \ge 1$ let $G_n$ denote the orthogonal
complement of $H_n$ in $H_{n-1}$. Show that the spaces $G_n$ are all isometric as Hilbert
spaces and set $K := \bigcap_{n\in\mathbb{N}} H_n$.

8.16 This problem sketches an alternative proof of the Sz.-Nagy dilation theorem. Let
$T \in \mathscr{L}(H)$ be a contraction and $D_T = (I - T^*T)^{1/2}$ the associated defect operator.
A dilation of a bounded operator $T$ on $H$ is a bounded operator $\widetilde T$ on a Hilbert
space $\widetilde H$ containing $H$ isometrically as a closed subspace such that

\[ T^n = P\widetilde T^{\,n}J, \quad n \in \mathbb{N}, \]

where $J$ is the inclusion mapping from $H$ into $\widetilde H$ and $P = J^*$ is the orthogonal
projection of $\widetilde H$ onto $H$, viewed as a mapping from $\widetilde H$ onto $H$.

(a) Show that $TD_T = D_{T^*}T$.

On the Hilbert space

\[ \ell^2(H) = \Big\{h = (h_n)_{n\ge 1} \subseteq H : \sum_{n\ge 1} \|h_n\|^2 < \infty\Big\} \]

we consider the operator $S : h \mapsto (Th_1, D_Th_1, h_2, h_3, \dots)$.

(b) Show that $S$ is an isometry, that is, $\|Sh\| = \|h\|$ for all $h \in \ell^2(H)$.
(c) Show that $T^n = J^*S^nJ$ for all $n \in \mathbb{N}$, where $J : H \to \ell^2(H)$ is given by
$h \mapsto (h,0,0,\dots)$. Conclude that $S$ is a dilation of $T$.
(d) Show that a dilation of a dilation is a dilation.

To complete the proof of the theorem it suffices to show that every isometry has a
unitary dilation. Accordingly, in the rest of the problem we consider an isometry
$S$ on a Hilbert space $G$.

(e) Show that under these assumptions we have $D_S = 0$.

On the Hilbert space direct sum $G \oplus G$ define the operator

\[
U := \begin{pmatrix} S & D_{S^*} \\ D_S & -S^* \end{pmatrix}
 = \begin{pmatrix} S & D_{S^*} \\ 0 & -S^* \end{pmatrix}.
\]

(f) Show that


(g) Show that $S^*D_{S^*} = D_SS^* = 0$ and use this to prove that $U$ is unitary.
(h) Show that $U$ is a dilation of $S$.
Hint: First compute $U^2$ and use this for finding $U^{2k}$ and $U^{2k+1}$.


9
The Spectral Theorem for Bounded Normal Operators


In this chapter we show that normal operators admit a spectral representation as sums
or integrals of orthogonal projections. We begin by showing that every compact normal
operator $T$ admits the spectral decomposition

\[ T = \sum_{n\ge 1} \lambda_nP_n, \]

where $(\lambda_n)_{n\ge 1}$ is the sequence of eigenvalues of $T$ and $(P_n)_{n\ge 1}$ is the sequence of
orthogonal projections onto the corresponding eigenspaces. For arbitrary bounded normal
operators $T$, the main result of this chapter, the spectral theorem for bounded normal
operators, provides an analogous representation as an integral

\[ T = \int_{\sigma(T)} \lambda\,dP(\lambda). \]

9.1 The Spectral Theorem for Compact Normal Operators 


Throughout this chapter we let H be a Hilbert space. 
From Linear Algebra we know that normal matrices can be orthogonally diago- 
nalised. This result admits the following extension to compact normal operators on H: 


Theorem 9.1 (Spectral theorem for compact normal operators). Let $T \in \mathscr{L}(H)$ be a
compact normal operator and let $(\lambda_n)_{n\ge 1}$ be the (finite or infinite) sequence of its
distinct eigenvalues. Let $(E_n)_{n\ge 1}$ be the corresponding sequence of eigenspaces, and let
$(P_n)_{n\ge 1}$ be the associated sequence of orthogonal projections. Then:

(1) the spaces $E_n$ are pairwise orthogonal and have dense linear span;
(2) we have

\[ T = \sum_{n\ge 1} \lambda_nP_n, \]

with convergence in the operator norm of $\mathscr{L}(H)$.


Proof The proof of the theorem uses the properties of spectra of compact operators on
Banach spaces established in Theorem 7.11. In the present situation, where the compact
operator acts on a Hilbert space, the proof of this theorem can be considerably
shortened; see Problem 9.2.


(1): From Proposition 8.13 (applied to $T - \lambda$) we see that $Tx - \lambda x = 0$ if and only if
$T^*x - \bar\lambda x = 0$, so $\lambda$ is an eigenvalue for $T$ if and only if $\bar\lambda$ is an eigenvalue for $T^*$,
and the eigenspaces coincide.

If $y_m \in E_m$ and $y_n \in E_n$ are nonzero vectors, then

\[ \lambda_m (y_m|y_n) = (Ty_m|y_n) = (y_m|T^*y_n) = (y_m|\bar\lambda_n y_n) = \lambda_n (y_m|y_n). \]

If $\lambda_m \ne \lambda_n$, then this is possible only if $(y_m|y_n) = 0$. This gives $E_m \perp E_n$ for $n \ne m$.

Let $E := \bigoplus_{n\ge 1} E_n$ denote the closed linear span of the spaces $E_n$, $n \ge 1$. We wish
to prove that $E = H$. Suppose the contrary. Then $E^\perp$ is a nonzero closed subspace
of $H$. Since $TE_n \subseteq E_n$ for all $n \ge 1$ we have $TE \subseteq E$. Furthermore, if $x \in E_n$, then
$T^*x = \bar\lambda_n x \in E_n$, so $T^*E_n \subseteq E_n$. This being true for all $n \ge 1$, it follows that $T^*E \subseteq E$.
Hence, by Lemma 8.16 we have $TE^\perp \subseteq E^\perp$ and the restriction $T^\perp$ of $T$ to $E^\perp$ is normal.
Moreover, by the very construction of $E$, $T^\perp$ has no eigenvalues. By Theorem 7.11,
this implies that $\sigma(T^\perp) \subseteq \{0\}$, and since $\sigma(T^\perp) \ne \emptyset$ it follows that $\sigma(T^\perp) = \{0\}$.
Now Proposition 8.13 implies that $T^\perp = 0$. This means that every nonzero element of $E^\perp$
is an eigenvector of $T^\perp$ with eigenvalue $0$. This contradicts the observation just made
that $T^\perp$ has no eigenvalues and completes the proof that $E = H$.


(2): Let $P_n$ denote the orthogonal projection onto $E_n$. Then for all $x \in H$ we have
$x = \sum_{n\ge 1} P_n x$ with convergence in $H$. This is clear for every $x \in E_n$, and since the span
of the spaces $E_n$ is dense in $H$ and the operators $\sum_{n=1}^N P_n$ are orthogonal projections
and hence have norm one, the convergence extends to all $x \in H$ by Proposition 1.19. It
follows that the sum $\sum_{n\ge 1} TP_n x$ converges as well, with sum $Tx$.

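Fix $\varepsilon > 0$. The set $\Lambda_\varepsilon := \{n \ge 1 : |\lambda_n| \ge \varepsilon\}$ is finite by Theorem 7.11. Let $N \ge 1$ be
so large that $\Lambda_\varepsilon \subseteq \{1,2,\dots,N\}$. Fixing $x \in H$ and writing $x_n := P_n x$, by orthogonality
we have

\[
\begin{aligned}
\Big\| Tx - \sum_{n=1}^N \lambda_n P_n x \Big\|^2
  &= \Big\| \sum_{n\ge 1} Tx_n - \sum_{n=1}^N \lambda_n x_n \Big\|^2
   = \Big\| \sum_{n\ge N+1} \lambda_n x_n \Big\|^2 \\
  &= \sum_{n\ge N+1} |\lambda_n|^2 \|x_n\|^2
   \le \varepsilon^2 \sum_{n\ge N+1} \|x_n\|^2 \le \varepsilon^2 \|x\|^2 .
\end{aligned}
\]

Taking the supremum over all $x \in H$ with $\|x\| \le 1$ we obtain

\[ \Big\| T - \sum_{n=1}^N \lambda_n P_n \Big\|^2 \le \varepsilon^2 . \]

This completes the proof.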
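
The finite-dimensional case of Theorem 9.1 can be checked by hand. The following sketch is an illustration added here, not part of the original argument: the matrix `T` (rotation by 90 degrees, which is normal but not selfadjoint) and the helper names are assumptions chosen for the example. It verifies that the two eigenprojections are orthogonal projections with orthogonal ranges, that they sum to the identity, and that $T = \sum_n \lambda_n P_n$.

```python
# Spectral decomposition T = sum_n lambda_n P_n for a normal 2x2 matrix.
# Hypothetical example: T is rotation by 90 degrees, with eigenvalues i and -i.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def combine(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

T = [[0, -1], [1, 0]]
eigenvalues = [1j, -1j]
# Orthogonal projections onto span{(1, -i)} and span{(1, i)} respectively.
P_plus = [[0.5, 0.5j], [-0.5j, 0.5]]
P_minus = [[0.5, -0.5j], [0.5j, 0.5]]
projections = [P_plus, P_minus]

zero = [[0, 0], [0, 0]]
identity = [[1, 0], [0, 1]]

is_normal = close(matmul(T, adjoint(T)), matmul(adjoint(T), T))
are_projections = all(close(matmul(P, P), P) and close(adjoint(P), P)
                      for P in projections)
are_orthogonal = close(matmul(P_plus, P_minus), zero)
resolve_identity = close(combine([1, 1], projections), identity)
reconstructs_T = close(combine(eigenvalues, projections), T)
```

All five flags evaluate to `True`. In infinite dimensions the new ingredient is the operator norm convergence of the series, which the proof obtains from the fact that only finitely many eigenvalues satisfy $|\lambda_n| \ge \varepsilon$.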


Let $H$ and $K$ be Hilbert spaces. For $h \in H$ and $k \in K$ we denote by $k \otimes h$ the operator
in $\mathscr{L}(H,K)$ defined by

\[ (k \otimes h)x := (x|h)k, \qquad x \in H. \]

If $H = K$ and $h \in H$ has norm one, then $h \otimes h$ is the orthogonal projection onto the
subspace spanned by $h$. If $T \in \mathscr{L}(H)$ is a compact normal operator, the eigenspaces
corresponding to nonzero eigenvalues are finite-dimensional. Choosing orthonormal bases
for each of them, from Theorem 9.1 we obtain a representation

\[ T = \sum_{n\ge 1} \lambda_n h_n \otimes h_n \]

with convergence in the operator norm, where now $(\lambda_n)_{n\ge 1}$ is the sequence of nonzero
eigenvalues of $T$ repeated according to multiplicities and $(h_n)_{n\ge 1}$ is an associated
orthonormal sequence of eigenvectors. Strictly speaking, the spectral theorem gives
convergence of the sum for ‘blockwise’ summation ‘per eigenspace’, but the proof of the
theorem may be repeated to obtain the convergence as stated. The geometric and algebraic
multiplicities of the eigenvalues coincide by Corollary 8.17, so we can unambiguously
speak about their multiplicity.

Theorem 9.1 allows us to deduce the following general representation theorem for
compact operators acting on a Hilbert space. It strengthens Proposition 7.6, which
asserted that such operators can be approximated in operator norm by finite rank operators.


Theorem 9.2 (Singular value decomposition). Let $T \in \mathscr{L}(H,K)$ be a compact operator,
where $K$ is another Hilbert space. Then $T$ admits a decomposition

\[ T = \sum_{n\ge 1} \lambda_n k_n \otimes h_n \]

with convergence in the operator norm, where $(\lambda_n)_{n\ge 1}$ is the sequence of nonzero
eigenvalues of the compact operator $(T^*T)^{1/2}$ repeated according to multiplicities, and
$(h_n)_{n\ge 1}$ and $(k_n)_{n\ge 1}$ are orthonormal sequences in $H$ and $K$ respectively, the former
consisting of eigenvectors of $(T^*T)^{1/2}$.


The proof depends on the following lemma. 


Lemma 9.3. If $S \in \mathscr{L}(H)$ is a positive compact operator, then its square root $S^{1/2}$ is
compact.


Proof By Theorem 9.1 we have $S = \sum_{n\ge 1} \nu_n Q_n$, with $(\nu_n)_{n\ge 1}$ the (nonnegative)
sequence of distinct nonzero eigenvalues of $S$ and $(Q_n)_{n\ge 1}$ the sequence of orthogonal
projections onto the corresponding eigenspaces. Fix $\varepsilon > 0$ and let $N \ge 1$ be so large that
$\nu_n < \varepsilon$ for all $n \ge N$. Then for $N' \ge N$ and $x \in H$ we have, by orthogonality,

\[
\Big\| \sum_{n=N}^{N'} \nu_n^{1/2} Q_n x \Big\|^2
  = \sum_{n=N}^{N'} \nu_n \|Q_n x\|^2
  \le \varepsilon \sum_{n=N}^{N'} \|Q_n x\|^2 \le \varepsilon \|x\|^2 .
\]

This implies that the sum $R := \sum_{n\ge 1} \nu_n^{1/2} Q_n$ converges in operator norm. We have $R \ge 0$
and $R^2 = \sum_{n\ge 1} \nu_n Q_n = S$, so $R = S^{1/2}$. This operator is the limit in operator norm of the
finite rank operators $\sum_{n=1}^N \nu_n^{1/2} Q_n$, $N \ge 1$, and therefore it is compact.


Proof of Theorem 9.2. By Theorem 9.1, applied to $|T| := (T^*T)^{1/2}$, which is compact
by Lemma 9.3, we arrive at a representation

\[ |T| = \sum_{n\ge 1} \lambda_n h_n \otimes h_n, \]

with convergence in the operator norm, where $(\lambda_n)_{n\ge 1}$ is the sequence of nonzero
eigenvalues of $|T|$ repeated according to multiplicities and the orthonormal sequence
$(h_n)_{n\ge 1}$ consists of eigenvectors of $|T|$. Let $T = U|T|$ with $U$ an isometry from
$\mathsf{R}(|T|)$ onto $\mathsf{R}(T)$ as in Theorem 8.30. The sequence $(k_n)_{n\ge 1}$ defined by $k_n := Uh_n$ is
orthonormal in $K$ and

\[ T = \sum_{n\ge 1} \lambda_n k_n \otimes h_n \]

with convergence in the operator norm.
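
To see the construction in the proof at work, here is a small worked example. It is an illustration added here: the matrix $T$ and the standard basis vectors $e_1, e_2$ of $\mathbb{C}^2$ are assumptions chosen for the example. The operator is not normal, so Theorem 9.1 does not apply to it directly, but the singular value decomposition is still available.

```latex
% Hypothetical example on H = K = \mathbb{C}^2 with standard basis e_1, e_2.
\[
  T = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, \qquad
  T^*T = \begin{pmatrix} 0 & 0 \\ 0 & 4 \end{pmatrix}, \qquad
  |T| = (T^*T)^{1/2} = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}.
\]
% The only nonzero eigenvalue of |T| is \lambda_1 = 2, with eigenvector h_1 = e_2.
% Since T = U|T|, we get k_1 = U h_1 = \lambda_1^{-1} T h_1 = e_1, and therefore
\[
  T = \lambda_1\, k_1 \otimes h_1 = 2\, e_1 \otimes e_2,
  \qquad\text{that is,}\qquad
  Tx = 2\,(x|e_2)\,e_1 = (2x_2, 0).
\]
```

Note that the only eigenvalue of $T$ itself is $0$, so the numbers $\lambda_n$ in Theorem 9.2 are in general not eigenvalues of $T$.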


As a second application of Theorem 9.1 we record the following formulas for the 
eigenvalues of a compact positive operator. 


Theorem 9.4 (Min-max theorem). Let $T \in \mathscr{L}(H)$ be compact and positive, and let
$\lambda_1 \ge \lambda_2 \ge \dots > 0$ be the sequence of its nonzero eigenvalues repeated according to
multiplicities. Then for all $n \ge 1$ we have

\[
\lambda_n = \inf_{\dim(Y) = n-1} \ \sup_{\substack{\|y\| = 1 \\ y \perp Y}} (Ty|y)
          = \inf_{\dim(Y) = n-1} \ \sup_{\substack{\|y\| = 1 \\ y \perp Y}} \|Ty\|,
\]

where the infima are taken over all subspaces $Y$ of $H$ of dimension $n-1$.


Proof For $n = 1$ the only subspace $Y$ to be considered is $\{0\}$. In this case both suprema
are taken over all norm one vectors $y \in H$ and are equal to $\|T\| = \sup_{\|y\|\le 1} \|Ty\| =
\sup_{\|y\|\le 1} (Ty|y) = \lambda_1$ by Theorem 8.11; here we use that $T$ is positive. In the remainder
of the proof we may therefore assume that $n \ge 2$.

Using Theorem 9.1 we select an orthonormal basis $(h_j)_{j\ge 1}$ for $H$ such that $Th_j =
\lambda_j h_j$ for all $j \ge 1$. Let $Y \subseteq H$ be any subspace of dimension $n-1$ and let $H_n$ denote
the linear span of the vectors $h_1,\dots,h_n$. Then $Y^\perp \cap H_n$ is a nonzero subspace of $H$, so it
contains a norm one vector $y$. Writing $y = \sum_{j=1}^n c_j h_j$ with $\sum_{j=1}^n |c_j|^2 = 1$, we have

\[ (Ty|y) = \sum_{j=1}^n \lambda_j |c_j|^2 \ge \lambda_n . \]

This proves the inequality

\[ \lambda_n \le \inf_{\dim(Y) = n-1} \ \sup_{\substack{\|y\| = 1 \\ y \perp Y}} (Ty|y). \]

The inequality

\[
\inf_{\dim(Y) = n-1} \ \sup_{\substack{\|y\| = 1 \\ y \perp Y}} (Ty|y)
  \le \inf_{\dim(Y) = n-1} \ \sup_{\substack{\|y\| = 1 \\ y \perp Y}} \|Ty\|
\]

holds trivially. To prove the inequality

\[ \inf_{\dim(Y) = n-1} \ \sup_{\substack{\|y\| = 1 \\ y \perp Y}} \|Ty\| \le \lambda_n , \]

let $y \perp H_{n-1}$ have norm one. Then $y = \sum_{j\ge n} (y|h_j)h_j$ and $\sum_{j\ge n} |(y|h_j)|^2 = 1$. Hence,

\[
\|Ty\|^2 = \Big\| \sum_{j\ge n} \lambda_j (y|h_j) h_j \Big\|^2
         = \sum_{j\ge n} \lambda_j^2 |(y|h_j)|^2
         \le \lambda_n^2 \sum_{j\ge n} |(y|h_j)|^2 = \lambda_n^2 ,
\]

and the result follows.


Corollary 9.5. If $S, T \in \mathscr{L}(H)$ are compact operators satisfying $0 \le S \le T$, and if
$\lambda_1 \ge \lambda_2 \ge \dots > 0$ and $\mu_1 \ge \mu_2 \ge \dots > 0$ are their sequences of nonzero eigenvalues,
both repeated according to multiplicities, then for all $n \ge 1$ we have $\lambda_n \le \mu_n$.
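
The corollary is stated without proof. A possible one-line derivation from the min-max theorem, added here as a sketch, valid for every $n$ for which both $\lambda_n$ and $\mu_n$ are defined, runs as follows.

```latex
% 0 \le S \le T means (Sy|y) \le (Ty|y) for all y \in H. Hence, for every
% subspace Y of dimension n-1, the supremum for S is dominated by that for T,
% and taking the infimum over Y in Theorem 9.4 gives
\[
  \lambda_n
  = \inf_{\dim(Y) = n-1} \ \sup_{\substack{\|y\| = 1 \\ y \perp Y}} (Sy|y)
  \le \inf_{\dim(Y) = n-1} \ \sup_{\substack{\|y\| = 1 \\ y \perp Y}} (Ty|y)
  = \mu_n .
\]
```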


9.2 Projection-Valued Measures 


This section and the next deal with the preliminaries needed to state and prove the 
spectral theorem for bounded normal operators. 
Let $(\Omega, \mathscr{F})$ be a measurable space.


Definition 9.6 (Projection-valued measures). A projection-valued measure on a
measurable space $(\Omega, \mathscr{F})$ is a mapping $P : \mathscr{F} \to \mathscr{L}(H)$ that assigns to every set $F \in \mathscr{F}$
an orthogonal projection $P_F := P(F) \in \mathscr{L}(H)$ such that the following conditions are
satisfied:


(i) $P_\Omega = I$;
(ii) for all $x \in H$ the mapping

\[ F \mapsto (P_F x|x), \qquad F \in \mathscr{F}, \]

defines a measure on $(\Omega, \mathscr{F})$.

For $x \in H$ the measure defined by (ii) is denoted by $P_x$. Thus, for all $F \in \mathscr{F}$,

\[ (P_F x|x) = P_x(F) = \int_\Omega \mathbf{1}_F \, dP_x . \]

From

\[ P_x(\Omega) = (P_\Omega x|x) = (x|x) = \|x\|^2 \]

we see that $P_x$ is a finite measure.
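We make some easy observations:


• $P_\emptyset = 0$.
Indeed, the additivity of $P_x$, applied to $\Omega = \Omega \cup \emptyset$, implies

\[ (x|x) = P_x(\Omega) = P_x(\Omega \cup \emptyset) = P_x(\Omega) + P_x(\emptyset) = (x|x) + P_x(\emptyset) \]

and therefore $(P_\emptyset x|x) = P_x(\emptyset) = 0$ for all $x \in H$.

• If $F_1, F_2 \in \mathscr{F}$ are disjoint, then the ranges of $P_{F_1}$ and $P_{F_2}$ are orthogonal.
Indeed, by additivity we have

\[ (P_{F_1 \cup F_2} x|x) = P_x(F_1 \cup F_2) = P_x(F_1) + P_x(F_2) = (P_{F_1} x|x) + (P_{F_2} x|x) \]

for all $x \in H$, so $P_{F_1 \cup F_2} = P_{F_1} + P_{F_2}$ by Proposition 8.1. Therefore,

\[
P_{F_1} + P_{F_2} = P_{F_1 \cup F_2} = P_{F_1 \cup F_2}^2 = (P_{F_1} + P_{F_2})^2
  = P_{F_1} + P_{F_1}P_{F_2} + P_{F_2}P_{F_1} + P_{F_2} .
\]

It follows that

\[ P_{F_1}P_{F_2} + P_{F_2}P_{F_1} = 0 . \]

Then,

\[ (P_{F_1}P_{F_2})^2 = P_{F_1}(P_{F_2}P_{F_1})P_{F_2} = -P_{F_1}(P_{F_1}P_{F_2})P_{F_2} = -P_{F_1}P_{F_2} \]

and similarly

\[ (P_{F_2}P_{F_1})^2 = -P_{F_2}P_{F_1} . \]

Adding these identities gives

\[ (P_{F_1}P_{F_2})^2 + (P_{F_2}P_{F_1})^2 = 0 . \]

But

\[
((P_{F_1}P_{F_2})^2 x|x) = (P_{F_1}P_{F_2}x \,|\, P_{F_2}P_{F_1}x)
  = -(P_{F_1}P_{F_2}x \,|\, P_{F_1}P_{F_2}x) = -\|P_{F_1}P_{F_2}x\|^2
\]

and similarly

\[ ((P_{F_2}P_{F_1})^2 x|x) = -\|P_{F_2}P_{F_1}x\|^2 . \]

Adding, we find

\[ \|P_{F_1}P_{F_2}x\|^2 + \|P_{F_2}P_{F_1}x\|^2 = 0 . \]

This is only possible if

\[ \|P_{F_1}P_{F_2}x\|^2 = \|P_{F_2}P_{F_1}x\|^2 = 0 . \]

Since this holds for all $x \in H$, we conclude that $P_{F_1}P_{F_2} = P_{F_2}P_{F_1} = 0$. In particular, $P_{F_1}$
and $P_{F_2}$ have orthogonal ranges.

• For all $F_1, F_2 \in \mathscr{F}$ we have $P_{F_1 \cap F_2} = P_{F_1}P_{F_2} = P_{F_2}P_{F_1}$.
In the special case of disjoint sets this has just been proved, with all three expressions
equal to $0$. From this special case it follows that

\[
\begin{aligned}
P_{F_1}P_{F_2} &= (P_{F_1 \setminus F_2} + P_{F_1 \cap F_2})(P_{F_2 \setminus F_1} + P_{F_1 \cap F_2}) \\
  &= P_{F_1 \setminus F_2}P_{F_2 \setminus F_1} + P_{F_1 \cap F_2}P_{F_2 \setminus F_1}
     + P_{F_1 \setminus F_2}P_{F_1 \cap F_2} + P_{F_1 \cap F_2}^2 \\
  &= 0 + 0 + 0 + P_{F_1 \cap F_2} .
\end{aligned}
\]

Reversing the roles of $F_1$ and $F_2$ gives the other identity.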
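
The observations above can be checked mechanically on a toy model. The following sketch is an illustration added here; the finite set `OMEGA`, the diagonal projections `P(F)` on $\mathbb{C}^3$ and the test vector `x` are assumptions chosen for the example. It verifies the normalisation $P_\Omega = I$, the identity $P_\emptyset = 0$, additivity on disjoint sets, and the multiplicativity $P_{F_1 \cap F_2} = P_{F_1}P_{F_2} = P_{F_2}P_{F_1}$.

```python
# A projection-valued measure on Omega = {0, 1, 2} (all subsets measurable),
# acting on H = C^3. Hypothetical example: P_F is the diagonal projection
# onto the coordinates indexed by F.
from itertools import combinations

OMEGA = (0, 1, 2)

def P(F):
    """Diagonal 0/1 matrix projecting onto the coordinates in F."""
    return [[1 if (i == j and i in F) else 0 for j in OMEGA] for i in OMEGA]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in OMEGA) for j in OMEGA] for i in OMEGA]

def add(a, b):
    """Entrywise sum of two 3x3 matrices."""
    return [[a[i][j] + b[i][j] for j in OMEGA] for i in OMEGA]

def P_x(F, x):
    """The scalar measure P_x(F) = (P_F x | x)."""
    PF = P(F)
    PFx = [sum(PF[i][j] * x[j] for j in OMEGA) for i in OMEGA]
    return sum(PFx[i] * x[i].conjugate() for i in OMEGA).real

subsets = [frozenset(c) for r in range(len(OMEGA) + 1)
           for c in combinations(OMEGA, r)]
identity = P(frozenset(OMEGA))
zero = [[0, 0, 0] for _ in OMEGA]
x = [1 + 2j, -1j, 3 + 0j]

normalised = identity == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
empty_is_zero = P(frozenset()) == zero
additive = all(P(F | G) == add(P(F), P(G))
               for F in subsets for G in subsets if not (F & G))
multiplicative = all(P(F & G) == matmul(P(F), P(G)) == matmul(P(G), P(F))
                     for F in subsets for G in subsets)
total_mass = P_x(frozenset(OMEGA), x)  # equals ||x||^2 = 5 + 1 + 9
```

Here `total_mass` is the total mass $P_x(\Omega) = \|x\|^2$ of the scalar measure $P_x$, in line with the computation following Definition 9.6.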


Example 9.7. If $T \in \mathscr{L}(H)$ is a compact normal operator, Theorem 9.1 implies that
the mapping $\{\lambda\} \mapsto P_\lambda$, where $P_\lambda$ is the orthogonal projection onto the eigenspace of
$\lambda$, extends to a projection-valued measure on $\sigma(T)$.


9.3 The Bounded Functional Calculus


Let $(\Omega, \mathscr{F})$ be a measurable space. The Banach space of all bounded measurable
functions $f : \Omega \to \mathbb{C}$, endowed with the supremum norm $\|f\|_\infty = \sup_{\omega\in\Omega} |f(\omega)|$, is denoted
by $B_b(\Omega)$.


Theorem 9.8 (Bounded functional calculus). Let $P : \mathscr{F} \to \mathscr{L}(H)$ be a
projection-valued measure. There exists a unique linear mapping $\Phi : B_b(\Omega) \to \mathscr{L}(H)$ with the
following properties:


(i) for all $F \in \mathscr{F}$ we have $\Phi(\mathbf{1}_F) = P_F$;
(ii) for all $f, g \in B_b(\Omega)$ we have $\Phi(fg) = \Phi(f)\Phi(g)$;
(iii) for all $f \in B_b(\Omega)$ we have $\Phi(\bar f) = (\Phi(f))^*$;
(iv) for all $f \in B_b(\Omega)$ we have $\|\Phi(f)\| \le \|f\|_\infty$;
(v) for all $f_n, f \in B_b(\Omega)$, if $\sup_n \|f_n\|_\infty < \infty$ and $f_n \to f$ pointwise on $\Omega$, then for
all $x \in H$ we have $\Phi(f_n)x \to \Phi(f)x$.

Moreover, for all $x \in H$ and $f \in B_b(\Omega)$ we have

\[ (\Phi(f)x|x) = \int_\Omega f \, dP_x \tag{9.1} \]

and

\[ \|\Phi(f)x\|^2 = \int_\Omega |f|^2 \, dP_x . \tag{9.2} \]

The operators $\Phi(f)$ are normal, and if $f$ is real-valued (respectively, takes values in
$[0,\infty)$) they are selfadjoint (respectively, positive).


Proof For $F \in \mathscr{F}$ we set $\Phi(\mathbf{1}_F) := P_F$, which is (i), and extend this definition by
linearity to simple functions $f$. It is routine to verify that $\Phi(f)$ is well defined for such
functions and that (ii) and (iii) hold.

If $f = \sum_{j=1}^k c_j \mathbf{1}_{F_j}$ is a simple function with disjoint supporting sets $F_j \in \mathscr{F}$, the
orthogonality of the vectors $P_{F_j}x$ gives

\[
\|\Phi(f)x\|^2 = \sum_{j=1}^k |c_j|^2 \|P_{F_j}x\|^2
  \le \max_{1\le j\le k} |c_j|^2 \, \|x\|^2 = \|f\|_\infty^2 \|x\|^2 .
\]

It follows that $\|\Phi(f)\| \le \|f\|_\infty$. For general $f \in B_b(\Omega)$ we can find a sequence of simple
functions $f_n$ converging to $f$ uniformly on $\Omega$ and satisfying $\|f_n\|_\infty \le \|f\|_\infty$. From
$\|\Phi(f_n) - \Phi(f_m)\| \le \|f_n - f_m\|_\infty$ we infer that the operators $\Phi(f_n)$ form a Cauchy
sequence in $\mathscr{L}(H)$. It follows that the limit

\[ \Phi(f) := \lim_{n\to\infty} \Phi(f_n) \]

exists, with convergence in the operator norm, and it is routine to check that the limit is
independent of the choice of approximating sequence. Moreover,

\[ \|\Phi(f)\| \le \limsup_{n\to\infty} \|f_n\|_\infty \le \|f\|_\infty , \]

which gives (iv). The general case of (ii) and (iii) now follows by approximation.

To prove (v) we first establish (9.1) and (9.2). If $f = \sum_{j=1}^k c_j \mathbf{1}_{F_j}$ is a simple function
with disjoint supporting sets $F_j \in \mathscr{F}$, then

\[
(\Phi(f)x|x) = \sum_{j=1}^k c_j (P_{F_j}x|x) = \sum_{j=1}^k c_j P_x(F_j) = \int_\Omega f \, dP_x .
\]

This gives (9.1) for simple functions. The general case follows by approximation and
dominated convergence. Similarly,

\[
\|\Phi(f)x\|^2 = \sum_{j=1}^k |c_j|^2 \|P_{F_j}x\|^2 = \sum_{j=1}^k |c_j|^2 (P_{F_j}x|x) = \int_\Omega |f|^2 \, dP_x .
\]

This gives (9.2) for simple functions. Again the general case follows by approximation
and dominated convergence. Property (v) follows from (9.2), applied to the functions
$f_n - f$, and dominated convergence.

To prove normality of $\Phi(f)$, note that (ii) and (iii) imply

\[ (\Phi(f))^*\Phi(f) = \Phi(\bar f)\Phi(f) = \Phi(|f|^2) = \Phi(f)\Phi(\bar f) = \Phi(f)(\Phi(f))^* . \]

Selfadjointness (respectively, positivity) for real-valued (respectively, nonnegative) $f$ is
immediate from (iii) (respectively, (9.1)).
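
As a sanity check on the construction, the following worked example computes $\Phi$ explicitly for a projection-valued measure on a finite set. It is an illustration added here; the set $\Omega = \{1,\dots,k\}$ and the projections $P_1,\dots,P_k$ are assumptions chosen for the example.

```latex
% Hypothetical example: \Omega = \{1,\dots,k\} with all subsets measurable, and
% P_1,\dots,P_k mutually orthogonal projections on H with P_1 + \dots + P_k = I.
\[
  P_F := \sum_{j\in F} P_j , \qquad
  f = \sum_{j=1}^k f(j)\,\mathbf{1}_{\{j\}}
  \quad\Longrightarrow\quad
  \Phi(f) = \sum_{j=1}^k f(j)\,P_j .
\]
% Multiplicativity (ii) follows from P_i P_j = \delta_{ij} P_j:
\[
  \Phi(f)\Phi(g) = \sum_{i,j=1}^k f(i)g(j)\,P_iP_j = \sum_{j=1}^k f(j)g(j)\,P_j = \Phi(fg) .
\]
% Identity (9.2) is the Pythagorean theorem for the orthogonal vectors P_j x:
\[
  \|\Phi(f)x\|^2 = \sum_{j=1}^k |f(j)|^2 \|P_jx\|^2
  = \sum_{j=1}^k |f(j)|^2 P_x(\{j\}) = \int_\Omega |f|^2 \, dP_x .
\]
```

In this finite setting every bounded function is simple, so no approximation step is needed; the general proof adds only the passage to uniform limits.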


In what follows we shall write

\[ \Phi(f) = \int_\Omega f \, dP = \int_\Omega f(\omega) \, dP(\omega) \]

for functions $f \in B_b(\Omega)$; the rigorous interpretation of these integrals is through (9.1).


Proposition 9.9 (Substitution). Let $(\Omega, \mathscr{F})$ and $(\Omega', \mathscr{F}')$ be measurable spaces and let
$f : \Omega \to \Omega'$ be a measurable mapping. If $P : \mathscr{F} \to \mathscr{L}(H)$ is a projection-valued measure,
then the mapping $Q : \mathscr{F}' \to \mathscr{L}(H)$ defined by

\[ Q_{F'} := P_{f^{-1}(F')}, \qquad F' \in \mathscr{F}', \]

is a projection-valued measure. Denoting by $\Phi$ and $\Psi$ the bounded functional calculi of
$P$ and $Q$, for all $g \in B_b(\Omega')$ we have

\[ \Phi(g \circ f) = \Psi(g). \]


Proof The elementary verification that $Q$ is a projection-valued measure is left as an
exercise. For all $F' \in \mathscr{F}'$ and $x \in H$,

\[
\int_\Omega \mathbf{1}_{F'} \circ f \, dP_x = \int_\Omega \mathbf{1}_{f^{-1}(F')} \, dP_x
  = (P_{f^{-1}(F')}x|x) = (Q_{F'}x|x) = \int_{\Omega'} \mathbf{1}_{F'} \, dQ_x .
\]

By linearity and monotone convergence, it follows that for nonnegative functions
$g \in B_b(\Omega')$ and $x \in H$ we have

\[ \int_\Omega g \circ f \, dP_x = \int_{\Omega'} g \, dQ_x , \]

that is, $(\Phi(g \circ f)x|x) = (\Psi(g)x|x)$. For nonnegative $g \in B_b(\Omega')$ the result now follows
from Proposition 8.1. For general $g \in B_b(\Omega')$ the result follows by splitting into real and
imaginary parts and considering their positive and negative parts.


Due to the absence of a reference measure $\mu$ on $\Omega$ we had to work with the Banach
space $B_b(\Omega)$ rather than with a Lebesgue space $L^\infty(\Omega,\mu)$. However, the
projection-valued measure $P$ can be used to define a Lebesgue-type space $L^\infty(\Omega, P)$ as follows.

Definition 9.10 (P-Essential boundedness). A measurable function $f : \Omega \to \mathbb{C}$ is said
to be P-essentially bounded if $P(\{|f| > r\}) = 0$ for some $r \ge 0$. We define $L^\infty(\Omega, P)$ to
be the space of all equivalence classes of P-essentially bounded measurable functions,
identifying the functions $f$ and $g$ when $P(\{f \ne g\}) = 0$.


With respect to the norm

\[ \|f\|_{L^\infty(\Omega,P)} := \inf\{r \ge 0 : P(\{|f| > r\}) = 0\} \]

the space $L^\infty(\Omega, P)$ is easily checked to be a Banach space.
For functions $f \in L^\infty(\Omega, P)$ we obtain a well-defined bounded operator $\Phi(f)$, the
properties (i) and (ii) hold again, and (iv) improves to an equality:


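Proposition 9.11. If $P : \mathscr{F} \to \mathscr{L}(H)$ is a projection-valued measure, then for all
$f \in L^\infty(\Omega, P)$ we have

\[ \|\Phi(f)\| = \|f\|_{L^\infty(\Omega,P)} . \]


Proof The upper bound ‘$\le$’ is an immediate consequence of part (iv) of Theorem
9.8. The lower bound ‘$\ge$’ is proved by observing that the definition of the P-essential
supremum implies that for all $\varepsilon > 0$ the projection $P_{F_\varepsilon}$ is nonzero, where

\[ F_\varepsilon := \{ |f| > (1-\varepsilon)\|f\|_{L^\infty(\Omega,P)} \} . \]

Then, for all $x \in \mathsf{R}(P_{F_\varepsilon})$,

\[
\begin{aligned}
\|\Phi(f)x\|^2 = \int_\Omega |f|^2 \, dP_x
  &\ge (1-\varepsilon)^2 \|f\|_{L^\infty(\Omega,P)}^2 \int_\Omega \mathbf{1}_{F_\varepsilon} \, dP_x \\
  &= (1-\varepsilon)^2 \|f\|_{L^\infty(\Omega,P)}^2 \|P_{F_\varepsilon}x\|^2
   = (1-\varepsilon)^2 \|f\|_{L^\infty(\Omega,P)}^2 \|x\|^2 .
\end{aligned}
\]

This shows that $\|\Phi(f)\| \ge (1-\varepsilon)\|f\|_{L^\infty(\Omega,P)}$. Since $\varepsilon > 0$ was arbitrary, the result
follows from this.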


We now turn to the special case of projection-valued measures defined on the Borel
$\sigma$-algebra $\mathscr{B}(K)$ of a compact subset $K$ of the complex plane. In that case we can
consider the function

\[ \mathrm{id}(\lambda) := \lambda . \]

The properties of the operator $\Phi(\mathrm{id})$ are summarised in the next proposition.


Proposition 9.12. Let $K \subseteq \mathbb{C}$ be compact and let $P : \mathscr{B}(K) \to \mathscr{L}(H)$ be a
projection-valued measure. Define the bounded operator $T_P \in \mathscr{L}(H)$ by $T_P := \Phi(\mathrm{id})$, that is,

\[ T_P = \Phi(\mathrm{id}) = \int_K \lambda \, dP(\lambda) . \]

Then:


(1) $T_P$ is normal;
(2) the spectrum of $T_P$ is contained in $K$;
(3) the support of $P$ equals $\sigma(T_P)$ in the following sense:
    (i) $P_{K \cap U} \ne 0$ for all open sets $U \subseteq \mathbb{C}$ such that $\sigma(T_P) \cap U \ne \emptyset$;
    (ii) $P_B = 0$ for all Borel sets $B \subseteq K$ such that $\sigma(T_P) \cap B = \emptyset$.


The operator $T_P$ is selfadjoint if $K \subseteq \mathbb{R}$, positive if $K \subseteq [0,\infty)$, unitary if $K \subseteq \mathbb{T}$, and an
orthogonal projection if $K \subseteq \{0,1\}$.


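Proof Let $\Phi : B_b(K) \to \mathscr{L}(H)$ be the bounded functional calculus associated with $P$
and write $T_P =: T$ for brevity.

Part (1) is immediate from the properties of the bounded calculus.

If $\lambda_0 \in \mathbb{C} \setminus K$, the functions $\lambda \mapsto \lambda_0 - \lambda$ and $\lambda \mapsto (\lambda_0 - \lambda)^{-1}$ are bounded on $K$ and
multiply to $1$. In view of $\Phi(\mathbf{1}) = P_K = I$ we have $\lambda_0 - T = \Phi(\lambda_0 - \mathrm{id})$, and property (ii)
of the bounded calculus shows that $\Phi((\lambda_0 - \mathrm{id})^{-1})$ is a two-sided inverse of $\lambda_0 - T$. It
follows that $\lambda_0 \in \varrho(T)$ and $R(\lambda_0, T) = \Phi((\lambda_0 - \mathrm{id})^{-1})$. This proves (2).

Next we show that $P_B = 0$ for all Borel sets $B \subseteq K$ such that $\sigma(T) \cap B = \emptyset$.

Step 1. Suppose first, for a contradiction, that there is a Borel set $B \subseteq K$ such that
$\overline{B} \cap \sigma(T) = \emptyset$ and $P_B = \mathbf{1}_B(T) \ne 0$. By the additivity of $P$, there exists a half-open
rectangle $R_1$ of sufficiently small diameter $\rho > 0$ such that $R_1 \cap \sigma(T) = \emptyset$ and $P_{B_1} =
\mathbf{1}_{B_1}(T) \ne 0$, where $B_1 = B \cap R_1$. Proceeding inductively we obtain a sequence of nested
half-open rectangles $R_1 \supseteq R_2 \supseteq \dots$ such that $\mathrm{diam}(R_n) \le 2^{-n+1}\rho$, $R_n \cap \sigma(T) = \emptyset$, and
$P_{B_n} = \mathbf{1}_{B_n}(T) \ne 0$ for $B_n := B \cap R_n$. Let $\bigcap_{n\ge 1} \overline{R_n} =: \{\lambda_0\}$ and note that $\lambda_0 \in \overline{B}$.

Let $x_n \in \mathsf{R}(\mathbf{1}_{B_n}(T))$ have norm one. Since $\mathbf{1}_{B_n}(T)$ is a projection we have $x_n =
\mathbf{1}_{B_n}(T)x_n$, and we obtain, using the multiplicativity of $\Phi$,

\[
\|Tx_n - \lambda_0 x_n\|^2 = ((T - \lambda_0)^*(T - \lambda_0)\mathbf{1}_{B_n}(T)x_n \,|\, x_n)
  = \int_K |\lambda - \lambda_0|^2 \mathbf{1}_{B_n}(\lambda) \, dP_{x_n}(\lambda)
\]

and therefore

\[
\|Tx_n - \lambda_0 x_n\|^2 \le \sup_{\lambda\in K} |\lambda - \lambda_0|^2 \mathbf{1}_{B_n}(\lambda)
  \le \mathrm{diam}^2(B_n) \le 2^{-2n+2}\rho^2 .
\]

This means that $\lambda_0$ is an approximate eigenvalue for $T$, so $\lambda_0 \in \sigma(T)$. We also have
$\lambda_0 \in \overline{B}$ and $\overline{B} \cap \sigma(T) = \emptyset$. This contradiction proves that $P_B = 0$ for all Borel sets
$B \subseteq K$ whose closure is disjoint from $\sigma(T)$.


Step 2. Now consider a general Borel set $B \subseteq K$ disjoint from $\sigma(T)$. The Borel set

\[ B^{(n)} := B \cap \Big\{ \lambda \in K : d(\lambda, \sigma(T)) \ge \frac1n \Big\} \]

has closure disjoint from $\sigma(T)$ and consequently $P_{B^{(n)}} = 0$ for all $n \ge 1$ by what we
already proved. In particular we have $P_x(B^{(n)}) = 0$ for all $x \in H$, and by monotone
convergence it follows that $P_x(B) = (P_B x|x) = 0$ for all $x \in H$. This implies $P_B = 0$.

This completes the proof of the support property (ii). It implies that there is no loss of
generality in assuming that $K = \sigma(T)$. Assuming this in the rest of the proof, we now
turn to the proof of the support property (i).

Let $U \subseteq \mathbb{C}$ be an open set such that $\sigma(T) \cap U \ne \emptyset$ and suppose, for a contradiction,
that $P_{\sigma(T) \cap U} = 0$. Then $P_B = 0$ for all Borel sets $B \subseteq \sigma(T) \cap U$. This implies that
$\int_{\sigma(T)} f \, dP = 0$ for all simple functions $f$ supported on such sets, and by approximation
the same is true for all bounded Borel functions supported in $\sigma(T) \cap U$. In particular,

\[ \int_{\sigma(T)} \mathbf{1}_{\sigma(T) \cap U}(\lambda) \, \lambda \, dP(\lambda) = 0 . \]

Let $\widetilde P : \mathscr{B}(\sigma(T) \setminus U) \to \mathscr{L}(H)$ be the restriction of $P$ to $\sigma(T) \setminus U$. Since

\[ \widetilde P_{\sigma(T) \setminus U} = P_{\sigma(T) \setminus U} = P_{\sigma(T)} = I , \]

$\widetilde P$ is a projection-valued measure. Denoting by $\widetilde T := \widetilde\Phi(\mathrm{id})$ the associated operator, we have

\[
\widetilde T = \int_{\sigma(T) \setminus U} \lambda \, d\widetilde P(\lambda)
  = \int_{\sigma(T)} \mathbf{1}_{\sigma(T) \setminus U}(\lambda) \, \lambda \, dP(\lambda)
  = \int_{\sigma(T)} \lambda \, dP(\lambda) = T
\]

and therefore $\sigma(T) = \sigma(\widetilde T) \subseteq \sigma(T) \setminus U$ by (2), which is absurd.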


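If $K \subseteq \mathbb{R}$ (respectively $K \subseteq [0,\infty)$), then $(Tx|x) \in \mathbb{R}$ (respectively $(Tx|x) \in [0,\infty)$)
for all $x \in H$ and therefore $T$ is selfadjoint (respectively positive). If $K \subseteq \mathbb{T}$, then $T$ is
invertible by part (2) and

\[ T^* = \int_K \bar\lambda \, dP(\lambda) = \int_K \lambda^{-1} \, dP(\lambda) = T^{-1} \]

and therefore $T$ is unitary. If $K \subseteq \{0,1\}$, then $T = 0$ if $K = \{0\}$ and $T = \int_K \lambda \, dP(\lambda) =
P_{\{1\}}$ if $1 \in K$. In both cases we see that $T$ is an orthogonal projection.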


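It follows from the proposition that $P$ restricts to a projection-valued measure on
$\sigma(T_P)$ in a natural way. Accordingly we have

\[ T_P = \int_{\sigma(T_P)} \lambda \, dP(\lambda) . \]

The spectral theorem for bounded normal operators, which will be proved in Section
9.4, asserts that conversely for every normal operator $T \in \mathscr{L}(H)$ there exists a unique
projection-valued measure $P$ on $\sigma(T)$ such that $T = T_P$, that is,

\[ T = \int_{\sigma(T)} \lambda \, dP(\lambda) . \]

This will allow us to prove converses to the four implications in the final assertion in
the proposition (see Corollary 9.18).
We have the following uniqueness result:

Proposition 9.13 (Uniqueness). Let $P$ and $\widetilde P$ be projection-valued measures on a
compact set $K \subseteq \mathbb{C}$ and define the operators $T_P$ and $T_{\widetilde P}$ as before. If $T_P = T_{\widetilde P}$, then $P = \widetilde P$.


Proof Let us write $T := T_P = T_{\widetilde P}$. Then $T = \Phi(\mathrm{id}) = \widetilde\Phi(\mathrm{id})$ and $T^* = \Phi(\overline{\mathrm{id}}) = \widetilde\Phi(\overline{\mathrm{id}})$,
where $\Phi$ and $\widetilde\Phi$ are the bounded calculi associated with $P$ and $\widetilde P$, respectively, and
$\mathrm{id}(\lambda) = \lambda$. By the multiplicativity of the calculi,

\[
T^m T^{*n} = \Phi(\mathrm{id}^m \, \overline{\mathrm{id}}^{\,n}) = \Phi(\mathrm{id})^m \Phi(\overline{\mathrm{id}})^n
  = \widetilde\Phi(\mathrm{id})^m \widetilde\Phi(\overline{\mathrm{id}})^n = \widetilde\Phi(\mathrm{id}^m \, \overline{\mathrm{id}}^{\,n}) .
\]

It follows that $p(T) = \Phi(p) = \widetilde\Phi(p)$ for all functions $p(z) = q(z, \bar z)$ with $q$ a polynomial
in two variables, and then $f(T) = \Phi(f) = \widetilde\Phi(f)$ for all $f \in C(\sigma(T))$ by approximation
using the Stone–Weierstrass theorem (Theorem 2.5). This means that

\[ \int_{\sigma(T)} f \, dP_x = \int_{\sigma(T)} f \, d\widetilde P_x , \qquad f \in C(\sigma(T)), \ x \in H. \]

If $R$ is an open rectangle in $\mathbb{C}$, and if $0 \le f_n \uparrow \mathbf{1}_R$ pointwise on $K$ with each $f_n$ continuous
on $K$, we find that

\[
\begin{aligned}
P_x(\sigma(T) \cap R) &= \int_{\sigma(T)} \mathbf{1}_R \, dP_x = \lim_{n\to\infty} \int_{\sigma(T)} f_n \, dP_x \\
  &= \lim_{n\to\infty} \int_{\sigma(T)} f_n \, d\widetilde P_x = \int_{\sigma(T)} \mathbf{1}_R \, d\widetilde P_x = \widetilde P_x(\sigma(T) \cap R) .
\end{aligned}
\]

This means that $P_x(\sigma(T) \cap R) = \widetilde P_x(\sigma(T) \cap R)$ for all open rectangles $R$. By Dynkin’s
lemma (Lemma E.4), this implies that $P_x(B) = \widetilde P_x(B)$ and therefore

\[ (P_B x|x) = P_x(B) = \widetilde P_x(B) = (\widetilde P_B x|x) \]

for all $x \in H$ and Borel subsets $B$ of $\sigma(T)$. It follows that $P_B = \widetilde P_B$ for all Borel subsets
$B$ of $\sigma(T)$. Since $P$ and $\widetilde P$ are supported on $\sigma(T)$ this completes the proof.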


9.4 The Spectral Theorem for Bounded Normal Operators 
We are now ready to state and prove the spectral theorem for bounded normal operators. 


Theorem 9.14 (Spectral theorem for bounded normal operators). Let $T \in \mathscr{L}(H)$ be a
normal operator. There exists a unique projection-valued measure $P$ on $\sigma(T)$ such that

\[ T = \int_{\sigma(T)} \lambda \, dP(\lambda) . \]


For the proof of Theorem 9.14 we need the following elementary consequence of the
Riesz representation theorem.

Proposition 9.15. Let $a : H \times H \to \mathbb{C}$ be a sesquilinear mapping with the property that
there exists a constant $C \ge 0$ such that

\[ |a(x,y)| \le C\|x\|\|y\|, \qquad x, y \in H. \]

Then there exists a unique operator $A \in \mathscr{L}(H)$ such that

\[ a(x,y) = (Ax|y), \qquad x, y \in H. \]

Moreover, $\|A\| \le C$, where $C$ is the boundedness constant of $a$.


Proof For all $y \in H$, the mapping $x \mapsto a(x,y)$ is a bounded functional on $H$, and the
Riesz representation theorem gives a unique $w = w(y) \in H$ satisfying

\[ a(x,y) = (x|w(y)), \qquad x \in H. \]

Let $B : y \mapsto w = w(y)$ be the resulting mapping. Then,

\[ a(x,y) = (x|By), \qquad x, y \in H. \]

We claim that $B : H \to H$ is a bounded operator. Indeed, if $c_1, c_2 \in \mathbb{K}$ and $y_1, y_2 \in H$, for
all $x \in H$ we have

\[
\begin{aligned}
(x|B(c_1y_1 + c_2y_2)) &= a(x, c_1y_1 + c_2y_2) = \bar c_1 a(x,y_1) + \bar c_2 a(x,y_2) \\
  &= \bar c_1 (x|By_1) + \bar c_2 (x|By_2) = (x|c_1By_1 + c_2By_2) .
\end{aligned}
\]

Since this equality holds for all $x \in H$ it follows that $B$ is linear. Furthermore,

\[ \|Bx\|^2 = (Bx|Bx) = |(Bx|Bx)| = |a(Bx,x)| \le C\|Bx\|\|x\| . \]

Consequently $\|Bx\| \le C\|x\|$ for all $x \in H$, so $B$ is bounded with $\|B\| \le C$. The operator
$A := B^*$ has the required properties.


Proof of Theorem 9.14. We begin with existence. For $x, y \in H$ consider the linear
mapping $\phi_{x,y} : C(\sigma(T)) \to \mathbb{C}$,

\[ \phi_{x,y}(f) := (f(T)x|y), \]

where $f(T)$ is given by the continuous functional calculus of $T$. The bound $\|f(T)\| \le
\|f\|_\infty$ implies that $\phi_{x,y}$ is bounded and

\[ \|\phi_{x,y}\| \le \|x\|\|y\| . \]

By the Riesz representation theorem (Theorem 4.2) there exists a unique complex Borel
measure $P_{x,y}$ on $\sigma(T)$ such that

\[ (f(T)x|y) = \int_{\sigma(T)} f \, dP_{x,y}, \qquad f \in C(\sigma(T)). \tag{9.3} \]

Note that

\[ \|P_{x,y}\| = \sup_{\|f\|_\infty \le 1} |(f(T)x|y)| \le \|x\|\|y\| \]

since $\|f(T)\| \le \|f\|_\infty$. Hence by Proposition 9.15, for all $f \in B_b(\sigma(T))$ there exists a
unique bounded operator on $H$, which we denote by $f(T)$, such that

\[ (f(T)x|y) = \int_{\sigma(T)} f \, dP_{x,y} \tag{9.4} \]

for all $x, y \in H$. For $f \in C(\sigma(T))$, $x = y$, and $P_x := P_{x,x}$, (9.4) is consistent with (9.3).
Taking $f(\lambda) = \lambda$ and $x = y$ gives the identity in the statement of the theorem,

\[ (Tx|x) = \int_{\sigma(T)} \lambda \, dP_x(\lambda), \]

except that it remains to be proved that the measures $P_x$ come from a projection-valued
measure. The remainder of the proof is devoted to showing that this is indeed the case.

For Borel sets $B \subseteq \sigma(T)$ we define the bounded operator $P_B \in \mathscr{L}(H)$ by

\[ P_B := \mathbf{1}_B(T) . \]

This operator is positive since $\mathbf{1}_B$ is nonnegative, and for all $x \in H$ we have

\[ (P_B x|x) = \int_{\sigma(T)} \mathbf{1}_B \, dP_x = P_x(B) . \]

By the multiplicativity of the functional calculus, $P_B P_{B'} = P_{B'} P_B$ for all Borel sets $B, B' \subseteq
\sigma(T)$. Proving that $P_B$ is a projection takes a bit of effort since we must proceed from
first principles; once we have established the existence of a projection-valued measure
$P$ generating the measures $P_x$, various steps in the argument appear as special cases of
properties of the bounded functional calculus for $P$ already established in Theorem 9.8.
We first assume that $U$ is open as a subset of $\sigma(T)$ (that is, we assume that $U =
U' \cap \sigma(T)$ for some open set $U' \subseteq \mathbb{C}$). Choose a sequence of real-valued continuous
functions $f_n \in C(\sigma(T))$ satisfying $0 \le f_n \uparrow \mathbf{1}_U$ pointwise. Then, for all $x \in H$,

\[
(P_U x|x) = \int_{\sigma(T)} \mathbf{1}_U \, dP_x = \lim_{n\to\infty} \int_{\sigma(T)} f_n \, dP_x = \lim_{n\to\infty} (f_n(T)x|x) \tag{9.5}
\]

by the monotone convergence theorem. By polarisation this implies

\[ (P_U x|y) = \lim_{n\to\infty} (f_n(T)x|y) \]

for all $x, y \in H$. Then,

\[
\begin{aligned}
(P_U^2 x|y) &= \lim_{n\to\infty} (f_n(T)x|P_U y) = \lim_{n\to\infty} \lim_{m\to\infty} (f_n(T)x|f_m(T)y) \\
  &\overset{(i)}{=} \lim_{n\to\infty} \lim_{m\to\infty} (f_m(T)f_n(T)x|y)
   \overset{(ii)}{=} \lim_{n\to\infty} \lim_{m\to\infty} ((f_m f_n)(T)x|y)
   \overset{(iii)}{=} (P_U x|y),
\end{aligned}
\]

where (i) uses that $f_m$ is real-valued, implying that $f_m(T)$ is selfadjoint, (ii) follows from
the multiplicativity of the continuous calculus, and (iii) follows by repeating the steps
of (9.5). This proves that $P_U$ is a projection.

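For general Borel sets $B \subseteq \sigma(T)$ we use the outer regularity of the measure $P_x$ (see
Proposition E.16) to find, for every $n \ge 1$, an open set $U_n$ (depending on $x$) containing
$B$ and such that $U_n \setminus B$ has $P_x$-measure less than $1/n$. Then

\[ 0 \le (P_{U_n \setminus B}x|x) = P_x(U_n \setminus B) < \frac1n . \]

It follows that $P_{U_n \setminus B}$ is a positive operator whose positive square root satisfies

\[ \|P_{U_n \setminus B}^{1/2}x\|^2 = (P_{U_n \setminus B}x|x) < \frac1n . \]

By Theorem 8.11, $0 \le P_{U_n \setminus B} \le I$ implies $\sigma(P_{U_n \setminus B}) \subseteq [0,1]$, and the spectral
mapping theorem (Theorem 8.23) implies that $\sigma(P_{U_n \setminus B}^{1/2}) \subseteq [0,1]$. Another application
of Theorem 8.11 gives $0 \le P_{U_n \setminus B}^{1/2} \le I$ and therefore $\|P_{U_n \setminus B}^{1/2}\| \le 1$. As a consequence,

\[
\|P_{U_n \setminus B}x\| \le \|P_{U_n \setminus B}^{1/2}\| \, \|P_{U_n \setminus B}^{1/2}x\| \le \|P_{U_n \setminus B}^{1/2}x\| < \frac{1}{n^{1/2}} .
\]

Since $P_B$ and $P_{U_n}$ commute we have $P_B^2 - P_{U_n}^2 = (P_B + P_{U_n})(P_B - P_{U_n})$ and, since
$P_{U_n} - P_B = P_{U_n \setminus B}$,

\[
\begin{aligned}
\|P_B^2 x - P_B x\| &\le \|(P_B^2 - P_{U_n}^2)x\| + \underbrace{\|(P_{U_n}^2 - P_{U_n})x\|}_{=0} + \|(P_{U_n} - P_B)x\| \\
  &\le (\|P_B + P_{U_n}\| + 1)\|P_{U_n \setminus B}x\| < \frac{3}{n^{1/2}} ,
\end{aligned}
\]

where we used that $\|P_F\| = \sup_{\|x\|\le 1} (P_F x|x) = \sup_{\|x\|\le 1} P_x(F) \le 1$ for Borel subsets
$F \subseteq \sigma(T)$. This being true for all $n \ge 1$, we conclude that $P_B^2 x = P_B x$. Since $x \in H$ was
arbitrary, this proves that $P_B$ is a projection and $B \mapsto P_B$ is a projection-valued measure.
Uniqueness follows from Proposition 9.13.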


Example 9.16. Let $H = L^2(0,1)$. The position operator $X \in \mathscr{L}(H)$ is defined by

\[ Xf(x) := xf(x), \qquad x \in (0,1), \ f \in L^2(0,1). \]

We have $\sigma(X) = [0,1]$, and the projection-valued measure of $X$ is given by

\[ P_B f = \mathbf{1}_B f \]

for all $f \in L^2(0,1)$ and Borel subsets $B$ of $[0,1]$ (see Problem 9.9). In particular, $\mathbf{1}_B(X) =
0$ for any Borel null set $B$ of $[0,1]$. As a consequence, for all $\phi \in L^\infty(0,1)$ the operator
$\phi(X)$ is well defined as a bounded operator on $H$. In fact we have $\phi(X) = T_\phi$, where
$T_\phi f(x) := \phi(x)f(x)$. The operators $\phi(X)$ arise quite naturally as follows:

Proposition 9.17. An operator $S \in \mathscr{L}(H)$ commutes with the operator $X$ (that is, $SX =
XS$) if and only if there exists $\phi \in L^\infty(0,1)$ such that $S = \phi(X)$.


Proof The ‘if’ part being easy, we concentrate on the ‘only if’ part. Let $S \in \mathscr{L}(H)$
be such that $SX = XS$. To show that there exists $\phi \in L^\infty(0,1)$ such that $S = \phi(X)$, a
natural candidate for $\phi$ is the function $S\mathbf{1}$ (which a priori is an element of $L^2(0,1)$). For
$f_n(x) := x^n$ with $n \in \mathbb{N}$ we have $f_n = X^n f_0 = X^n \mathbf{1}$ and therefore

\[ Sf_n = SX^n \mathbf{1} = X^n S\mathbf{1} = X^n \phi = [x \mapsto f_n(x)\phi(x)] . \]

By the Weierstrass approximation theorem, the polynomials are dense in $C[0,1]$ and
hence in $L^2(0,1)$ (as $C[0,1]$ is dense in $L^2(0,1)$). By a limiting argument we conclude
that $Sf = [x \mapsto f(x)\phi(x)]$. The boundedness of $S$ implies the boundedness of the
multiplier $f \mapsto \phi f$, which in turn implies that $\phi \in L^\infty(0,1)$.


We are now in a position to prove the assertions made in Section 8.1. 
Corollary 9.18. Let $T \in \mathscr{L}(H)$ be a normal operator.


(1) $T$ is selfadjoint if and only if $\sigma(T) \subseteq \mathbb{R}$;

(2) $T$ is positive if and only if $\sigma(T) \subseteq [0,\infty)$;

(3) $T$ is unitary if and only if $\sigma(T) \subseteq \mathbb{T}$;

(4) $T$ is an orthogonal projection if and only if $\sigma(T) \subseteq \{0,1\}$.


Furthermore, if $T$ is a projection, then it is an orthogonal projection.


Proof The ‘only if’ statements have already been proved in Chapter 8. The ‘if’
statements follow from Theorem 9.14, either by combining it with the final assertion of
Proposition 9.12 or by the following direct reasoning.

Write $T = \int_{\sigma(T)} \lambda \, dP(\lambda)$ as in Theorem 9.14. If $\sigma(T)$ is contained in the real line,
then, by Theorem 9.8,

\[ T^* = \int_{\sigma(T)} \bar\lambda \, dP(\lambda) = \int_{\sigma(T)} \lambda \, dP(\lambda) = T . \]

If $\sigma(T)$ is contained in the nonnegative half-line, then for all $x \in H$ we have

\[ (Tx|x) = \int_{\sigma(T)} \lambda \, dP_x(\lambda) \ge 0 . \]

If $\sigma(T)$ is contained in the unit circle, then $T$ is invertible and

\[ T^* = \int_{\sigma(T)} \bar\lambda \, dP(\lambda) = \int_{\sigma(T)} \lambda^{-1} \, dP(\lambda) = T^{-1} . \]

If $T$ is a normal projection, the spectral theorem for normal operators gives

\[ T = \int_{\{0,1\}} \lambda \, dP(\lambda) = 0 \cdot P_{\{0\}} + 1 \cdot P_{\{1\}} = P_{\{1\}} , \]

which is an orthogonal projection. Since the spectrum of any projection is contained in
$\{0,1\}$, this also gives the final assertion.


As the example of the Volterra operator $V$ shows (see Example 8.12 and Problem
8.12), normality cannot be omitted in parts (1), (2), and (4). The operator $I + V$ shows
that normality cannot be omitted in part (3).

For normal operators $T \in \mathscr{L}(H)$ and functions $f \in B_b(\sigma(T))$, the operator $\Phi(f) \in
\mathscr{L}(H)$ defined in terms of the projection-valued measure $P$ of $T$ is denoted by $f(T)$:

\[ f(T) := \Phi(f) = \int_{\sigma(T)} f \, dP . \]

Therefore the properties of the bounded calculus $\Phi$ translate into corresponding
properties for the mapping $f \mapsto f(T)$:


Theorem 9.19 (Bounded functional calculus for normal operators). Let $T \in \mathscr{L}(H)$ be
normal. Then:
(i) for $f(z) = z^m \bar z^n$ we have $f(T) = T^m T^{*n}$;
(ii) for all $f, g \in B_b(\sigma(T))$ we have $(fg)(T) = f(T)g(T)$;
(iii) for all $f \in B_b(\sigma(T))$ we have $\bar f(T) = (f(T))^*$;
(iv) for all $f \in B_b(\sigma(T))$ we have $\|f(T)\| \le \|f\|_\infty$;
(v) for all $f_n, f \in B_b(\sigma(T))$, if $\sup_n \|f_n\|_\infty < \infty$ and $f_n \to f$ pointwise on $\sigma(T)$,
then for all $x \in H$ we have $f_n(T)x \to f(T)x$.


Moreover, for all $x \in H$ and $f \in B_b(\sigma(T))$ we have

\[ (f(T)x|x) = \int_{\sigma(T)} f \, dP_x \]

and

\[ \|f(T)x\|^2 = \int_{\sigma(T)} |f|^2 \, dP_x . \]

The operators $f(T)$ are normal, and if $f$ is real-valued (respectively, nonnegative) they
are selfadjoint (respectively, positive).


Proof Everything but (i) follows from Theorem 9.8; (i) follows from (ii) and (iii).


Theorem 9.20 (Composition). Let $T \in \mathscr{L}(H)$ be a normal operator, let $f \in C(\sigma(T))$,
and put $K := \sigma(f(T)) = f(\sigma(T))$. Then:


(i) the projection-valued measure $Q$ of $f(T)$ is given by

\[ Q_B = P_{f^{-1}(B)}, \qquad B \in \mathscr{B}(K); \]

(ii) for all $g \in B_b(K)$ we have

\[ (g \circ f)(T) = g(f(T)). \]

Proof Once the first assertion has been proved, the second is merely a restatement of
the substitution rule of Proposition 9.9.

Let $P$ and $Q$ denote the projection-valued measures on $\sigma(T)$ and $K$ of the normal
operators $T$ and $f(T)$, respectively, and define the projection-valued measure $\widetilde Q$ on $K$
by $\widetilde Q_B := P_{f^{-1}(B)}$ for Borel sets $B \subseteq K$.

Let $U$ be a relatively open subset of $K$. Approximating $0 \le g_n \uparrow \mathbf{1}_U$ pointwise with
functions $g_n$ in $C(K)$ and using the result of Step 1, for all $x \in H$ we obtain

\[
\begin{aligned}
Q_x(U) &= \int_K \mathbf{1}_U \, dQ_x = \lim_{n\to\infty} \int_K g_n \, dQ_x \\
  &= \lim_{n\to\infty} (g_n(f(T))x|x) = \lim_{n\to\infty} ((g_n \circ f)(T)x|x) \\
  &= \lim_{n\to\infty} \int_{\sigma(T)} g_n \circ f \, dP_x = \int_{\sigma(T)} \mathbf{1}_U \circ f \, dP_x
   = \int_K \mathbf{1}_U \, d\widetilde Q_x = \widetilde Q_x(U) .
\end{aligned}
\]

The inner regularity of the finite Borel measures $Q_x$ and $\widetilde Q_x$ now implies that these
measures are equal.


It is of some interest to revisit the case of a compact normal operator $T$. The spectrum
of $T$ is then a finite or infinite sequence $(\lambda_n)_{n\ge 1}$ with $0$ as its only possible limit point.
In Theorem 8.15 we have already shown that for any nonzero $\lambda \in \sigma(T)$, the orthogonal
projection $P_\lambda$ onto the corresponding eigenspace equals the spectral projection $P^{\{\lambda\}}$ of
Theorem 6.25.


Proposition 9.21. Let $T \in \mathscr{L}(H)$ be a compact normal operator and let $P$ be its
projection-valued measure. Then for all nonzero $\lambda \in \sigma(T)$,

\[ P_{\{\lambda\}} = P^{\{\lambda\}} = P_\lambda . \]


Proof Putting $\widetilde P_{\{\lambda\}} := P_\lambda$ and extending this definition by putting $\widetilde P_{\{0\}} := 0$ if $0$ is not
an eigenvalue, the spectral theorem for compact normal operators implies that $\widetilde P$ defines
a projection-valued measure and

\[
\begin{aligned}
(Tx|x) &= \sum_{\lambda\in\sigma(T)} \lambda (P_\lambda x|x) = \sum_{\lambda\in\sigma(T)\setminus\{0\}} \lambda (P_\lambda x|x) \\
  &= \int_{\sigma(T)\setminus\{0\}} \lambda \, d\widetilde P_x(\lambda) = \int_{\sigma(T)} \lambda \, d\widetilde P_x(\lambda) .
\end{aligned}
\]

Hence, by the uniqueness theorem for projection-valued measures, $P = \widetilde P$.


Further results along this line are given in Problem 9.7 and Theorem 10.56. 
We conclude this section with two famous results due to von Neumann. 


Theorem 9.22 (von Neumann). If $T \in \mathscr{L}(H)$ is a contraction, then for all polynomials
$p$ in one complex variable we have

\[ \|p(T)\| \le \sup_{|z|=1} |p(z)|. \]

Proof First suppose that $U$ is a unitary operator. Then $\sigma(U)$ is contained in the unit
circle. Since unitaries are normal, by Theorem 9.14 we have $U = \int_{\sigma(U)} \lambda \, dP(\lambda)$, where
$P$ is the projection-valued measure of $U$. Then, by Theorem 9.8 and the fact that $\sigma(U) \subseteq \mathbb{T}$,

\[ \|p(U)\| = \sup_{z \in \sigma(U)} |p(z)| \le \sup_{|z|=1} |p(z)|. \qquad (9.6) \]

Next let $T$ be a contraction. By the Sz.-Nagy dilation theorem (Theorem 8.36), $T$ has
a unitary dilation $U$, so that $T^n = J^* U^n J$ for some isometric operator $J$ and all $n \in \mathbb{N}$.
Then $p(T) = J^* p(U) J$, so $\|p(T)\| \le \|p(U)\|$ and the result follows from (9.6).
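
As a small numerical illustration of von Neumann's inequality, the following sketch checks it for one hand-picked example: the nilpotent contraction $T$ on $\mathbb{C}^2$ with matrix rows $(0,1)$ and $(0,0)$, and the polynomial $p(z) = 1 + z$. The example, the helper names, and the sampling of the unit circle are assumptions made for illustration only; the code verifies a single instance and proves nothing.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm(a):
    """Operator norm of a 2x2 matrix: square root of the largest eigenvalue of a* a."""
    adj = [[a[j][i].conjugate() for j in range(2)] for i in range(2)]
    g = matmul(adj, a)
    tr = (g[0][0] + g[1][1]).real
    det = (g[0][0] * g[1][1] - g[0][1] * g[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0)
    return math.sqrt((tr + math.sqrt(disc)) / 2)

def poly_of_matrix(coeffs, a):
    """Evaluate p(a) = sum_k coeffs[k] a^k by Horner's scheme."""
    identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
    result = [[0j, 0j], [0j, 0j]]
    for c in reversed(coeffs):
        result = matmul(result, a)
        result = [[result[i][j] + c * identity[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def sup_on_circle(coeffs, samples=4096):
    """Maximum of |p(z)| over equally spaced sample points of the unit circle."""
    best = 0.0
    for m in range(samples):
        z = cmath.exp(2j * math.pi * m / samples)
        best = max(best, abs(sum(c * z ** k for k, c in enumerate(coeffs))))
    return best

T = [[0j, 1 + 0j], [0j, 0j]]     # a contraction: its operator norm equals 1
p = [1, 1]                       # coefficients of p(z) = 1 + z, lowest degree first

lhs = op_norm(poly_of_matrix(p, T))   # norm of p(T) = I + T
rhs = sup_on_circle(p)                # sampled supremum of |p| on the unit circle
print(lhs, rhs)
```

Here $p(T) = I + T$ has norm $(1+\sqrt{5})/2$, while $|1 + z|$ attains its maximum $2$ on the unit circle at $z = 1$, so the inequality is strict in this example.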


As an application of the spectral theorem for normal operators we prove von Neumann's
theorem on pairs of commuting selfadjoint operators. Two projection-valued
measures $P : \mathscr{F} \to \mathscr{L}(H)$ and $P' : \mathscr{F}' \to \mathscr{L}(H)$ are said to commute if $P_F$ and $P'_{F'}$
commute for all $F \in \mathscr{F}$ and $F' \in \mathscr{F}'$.


Lemma 9.23. Let $P^1,\dots,P^k$ be commuting projection-valued measures on compact
Hausdorff spaces $K_1,\dots,K_k$, respectively. There exists a unique projection-valued
measure $P$ on $K = K_1 \times \cdots \times K_k$ such that

\[ P_{B_1 \times \cdots \times B_k} = P^1_{B_1} \circ \cdots \circ P^k_{B_k} \]

for all Borel sets $B_j \subseteq K_j$, $j = 1,\dots,k$.

Proof For Borel sets $B_j \subseteq K_j$, $j = 1,\dots,k$, and $f := 1_{B_1 \times \cdots \times B_k}$ define
\[ \Phi(f) := P^1_{B_1} \circ \cdots \circ P^k_{B_k}. \]

We extend this definition by linearity to functions $f$ on $K$ which are linear combinations
of indicator functions of Borel rectangles. Using the commutativity assumptions it is easily
checked that this is well defined and that for such $f$ we have

\[ \|\Phi(f)\| \le \|f\|_\infty. \]

We may use this to define, for functions $f \in C(K)$, a well defined bounded operator
$\Phi(f)$, by uniform approximation by simple functions of the above form. In the same
way as in the proof of Theorem 9.14, there exists a projection-valued measure $P$ on $K$
such that (9.3) holds for all $f \in C(K)$ and $x \in H$, that is,

\[ (\Phi(f)x|x) = \int_K f \, dP_x, \qquad f \in C(K), \ x \in H. \]

This projection-valued measure has the desired properties. Its uniqueness can be proved
using the method of Proposition 9.13.


Theorem 9.24 (von Neumann). Two selfadjoint operators $T_1, T_2 \in \mathscr{L}(H)$ commute if
and only if there exist a normal operator $S \in \mathscr{L}(H)$ and continuous functions $f_1, f_2 :
\sigma(S) \to \mathbb{R}$ such that

\[ T_1 = f_1(S), \qquad T_2 = f_2(S). \]

Proof The ‘if’ part follows from the multiplicativity of the Borel calculus of $S$. The
point is to prove the ‘only if’ part. To this end let $P^1$ and $P^2$ denote the projection-valued
measures of $T_1$ and $T_2$ on $\sigma(T_1)$ and $\sigma(T_2)$ respectively, and let $P$ be the
projection-valued measure on $K := \sigma(T_1) \times \sigma(T_2) \subseteq \mathbb{R}^2$ as in Lemma 9.23. Let
$L = \{z \in \mathbb{C} : \operatorname{Re} z \in \sigma(T_1), \ \operatorname{Im} z \in \sigma(T_2)\} \subseteq \mathbb{C}$, that is, we identify $K$ with a rectangle $L$ in the complex plane.
Under this identification, $P$ induces a projection-valued measure on $L$, denoted by $Q$.
The operator

\[ S := \int_L \lambda \, dQ(\lambda) \]

is normal. We will prove that $T_1 = f_1(S)$ and $T_2 = f_2(S)$ with $f_1(z) = \operatorname{Re} z$ and $f_2(z) =
\operatorname{Im} z$. For all $x \in H$ we have, using that $Q_x$ is supported on $\sigma(S)$,

\[ (f_1(S)x|x) = \int_{\sigma(S)} f_1(\lambda) \, dQ_x(\lambda) = \int_L \operatorname{Re} \lambda \, dQ_x(\lambda). \]

We claim that the image measure of $Q_x$ under $f_1$ equals $P^1_x$. Indeed, for all Borel sets
$B_1 \subseteq \sigma(T_1)$ we have

\[
\begin{aligned}
f_1(Q_x)(B_1) &= Q_x(B_1 + i\sigma(T_2)) = (Q_{B_1 + i\sigma(T_2)} x|x) \\
&= (P_{B_1 \times \sigma(T_2)} x|x) = (P^1_{B_1} P^2_{\sigma(T_2)} x|x) = (P^1_{B_1} x|x) = P^1_x(B_1),
\end{aligned}
\]

where we used that $P^2_{\sigma(T_2)} = I$. So $f_1(Q_x) = P^1_x$ as claimed. It now follows that
\[ \int_L f_1(\lambda) \, dQ_x(\lambda) = \int_{\sigma(T_1)} t \, dP^1_x(t) = (T_1 x|x). \]

Together with the above identities we find that $(f_1(S)x|x) = (T_1 x|x)$. This being true for
all $x \in H$, we conclude that $f_1(S) = T_1$. The identity $f_2(S) = T_2$ is proved similarly.


9.5 The Von Neumann Bicommutant Theorem 


In this section we prove a result of fundamental importance in the theory of operator
algebras, von Neumann’s celebrated bicommutant theorem. We also start the proof, to be
completed in the next chapter, of the identification of the bicommutant of a single
selfadjoint operator on a separable Hilbert space as being precisely the bounded functional
calculus of this operator.

We begin by introducing the relevant terminology.


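Definition 9.25 (Commutant). The commutant
of a subset $\mathscr{T} \subseteq \mathscr{L}(H)$ is the set

\[ \mathscr{T}' := \{S \in \mathscr{L}(H) : ST = TS \ \text{for all } T \in \mathscr{T}\}. \]

The bicommutant of a subset $\mathscr{T} \subseteq \mathscr{L}(H)$ is the
set $\mathscr{T}'' := (\mathscr{T}')'$.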
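
To make the definition concrete, here is a small worked example that is not taken from the text: $H = \mathbb{C}^2$, with operators identified with $2 \times 2$ matrices, and $\mathscr{T} = \{E\}$ for the (assumed, purely illustrative) diagonal projection $E$ below.

```latex
% Illustrative example; E and the matrix entries a, b, c, d are hypothetical names.
\[
E = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
S = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\;\Longrightarrow\;
SE = \begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix}, \quad
ES = \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}.
\]
% Hence SE = ES if and only if b = c = 0, so the commutant is the set of diagonal matrices:
\[
\{E\}' = \Bigl\{ \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} : a, d \in \mathbb{C} \Bigr\}.
\]
% A matrix commuting with every diagonal matrix commutes in particular with E,
% so it is diagonal; conversely, diagonal matrices commute with each other. Therefore
\[
\{E\}'' = \{E\}' = \{ \alpha I + \beta E : \alpha, \beta \in \mathbb{C} \}.
\]
```

In this example the bicommutant is exactly the unital algebra generated by $E$, which consists of the operators $f(E)$ with $f$ a function on $\sigma(E) = \{0, 1\}$.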


Definition 9.26 (Strong and weak operator
topologies). The strong operator topology on
$\mathscr{L}(H)$ is the smallest topology $\tau$ on $\mathscr{L}(H)$ with
the property that the linear mapping $T \mapsto Tx$
is continuous for all $x \in H$. The weak operator
topology on $\mathscr{L}(H)$ is the smallest topology $\tau$
on $\mathscr{L}(H)$ with the property that the linear mapping $T \mapsto (Tx|y)$ is continuous for all
$x, y \in H$.


John von Neumann, 1903-1957 


Definition 9.26 has natural counterparts for $\mathscr{L}(X)$ with $X$ a Banach space, but this
will not be needed.

In the same way as was explained in Section 4.6 for the weak and weak$^*$ topologies,
the strong operator topology is generated by the sets of the form

\[ \{T \in \mathscr{L}(H) : \|(T - T_0)x\| < \varepsilon\} \]

with $\varepsilon > 0$, $x \in H$, and $T_0 \in \mathscr{L}(H)$, and likewise the weak operator topology is generated
by the sets of the form
\[ \{T \in \mathscr{L}(H) : |((T - T_0)x|y)| < \varepsilon\} \]

with $\varepsilon > 0$, $x, y \in H$, and $T_0 \in \mathscr{L}(H)$.

For every set $\mathscr{T} \subseteq \mathscr{L}(H)$, the commutant $\mathscr{T}'$ is closed in the weak operator topology.
To see this, suppose that $T_0 \notin \mathscr{T}'$. Then there exist an operator $S \in \mathscr{T}$, vectors $x, y \in H$,
and a number $\delta > 0$ such that $|(T_0 S x|y) - (S T_0 x|y)| = \delta$. The set

\[ \mathscr{U} := \{T \in \mathscr{L}(H) : |((T_0 - T)Sx|y)| < \delta/2, \ |((T_0 - T)x|S^* y)| < \delta/2\} \]

is open in the weak operator topology, contains $T_0$, and every $T \in \mathscr{U}$ fails to commute
with $S$. It follows that $\mathscr{U} \cap \mathscr{T}' = \varnothing$.

Recall that a subalgebra of $\mathscr{L}(H)$ is a subspace of $\mathscr{L}(H)$ closed under taking
compositions. A $*$-subalgebra of $\mathscr{L}(H)$ is a subalgebra of $\mathscr{L}(H)$ closed under taking Hilbert
space adjoints. A subalgebra is said to be unital if it contains the identity operator.


Theorem 9.27 (von Neumann bicommutant theorem). For a unital $*$-subalgebra $\mathscr{A}$ of
$\mathscr{L}(H)$ the following assertions are equivalent:


(1) $\mathscr{A} = \mathscr{A}''$;
(2) $\mathscr{A}$ is weakly closed;
(3) $\mathscr{A}$ is strongly closed.


A $*$-subalgebra $\mathscr{A}$ of $\mathscr{L}(H)$ which is closed with respect to the operator norm is
called a $C^*$-algebra. This is not the commonly used definition (the standard definition
is mentioned in the Notes to Chapter 7), but one of the main theorems on the structure
of $C^*$-algebras establishes that this definition is equivalent to the standard one. A unital
$*$-subalgebra $\mathscr{A}$ of $\mathscr{L}(H)$ satisfying the equivalent conditions (1)–(3) of Theorem 9.27
is called a von Neumann algebra.


Proof The implications (1)$\Rightarrow$(2)$\Rightarrow$(3) are clear.

(3)$\Rightarrow$(1): Fix $x_0 \in H$ and let $P$ denote the orthogonal projection in $H$ onto the closure
$Y$ of the subspace $\{Tx_0 : T \in \mathscr{A}\}$. Since $\mathscr{A}$ contains the identity operator $I$ we have
$Px_0 = x_0$.

We claim that $Y$ is invariant under all $T \in \mathscr{A}$: for if $T \in \mathscr{A}$ and $y \in Y$, say $y =
\lim_{n\to\infty} T_n x_0$ with $T_n \in \mathscr{A}$ for all $n \ge 1$, then $Ty = \lim_{n\to\infty} T T_n x_0$ with $T T_n \in \mathscr{A}$ for all
$n \ge 1$, and therefore $Ty \in Y$. Similarly $Y^\perp$ is invariant under all $T \in \mathscr{A}$: for if $T \in \mathscr{A}$
and $x \in Y^\perp$, then for all $y \in Y$ we have $(Tx|y) = (x|T^* y) = 0$ since $T^* \in \mathscr{A}$ and therefore
$T^* y \in Y$.

By the claim, for all $T \in \mathscr{A}$ and $x \in H$ we have $TPx \in Y$ and $T(I - P)x \in Y^\perp$, and
therefore

\[ TPx = PTPx = PT(Px + (I - P)x) = PTx, \qquad x \in H. \]
We conclude that $TP = PT$ for all $T \in \mathscr{A}$, that is, $P \in \mathscr{A}'$.

Let $T_0 \in \mathscr{A}''$ be fixed. Then $PT_0 = T_0 P$ since $P \in \mathscr{A}'$, and this implies that $T_0 x_0 =
T_0 P x_0 = P T_0 x_0 \in Y$. By the definition of $Y$ this means that for all $\varepsilon > 0$ there exists an
element $T \in \mathscr{A}$ such that

\[ \|T_0 x_0 - T x_0\| < \varepsilon. \]

To show that every strongly open set containing $T_0$ intersects $\mathscr{A}$ it suffices to show that,
for any choice of $x_1,\dots,x_k \in H$ and $\varepsilon > 0$, there exists $T \in \mathscr{A}$ such that

\[ \|(T_0 - T)x_j\| < \varepsilon, \qquad j = 1,\dots,k. \qquad (9.7) \]

In what follows we set $x_0 := (x_1,\dots,x_k)$.

For $S \in \mathscr{L}(H)$ let $\rho(S) \in \mathscr{L}(H^k)$ be given by
\[ \rho(S)(h_1,\dots,h_k) = (Sh_1,\dots,Sh_k). \]
We claim that
\[ \rho(T_0) \in (\rho(\mathscr{A}))''. \]

Indeed, suppose that $S = (S_{ij})_{i,j=1}^k \in (\rho(\mathscr{A}))'$. This means that $S\rho(T)y = \rho(T)Sy$ for
all $T \in \mathscr{A}$ and $y = (y_1,\dots,y_k) \in H^k$, that is,

\[ \sum_{j=1}^k S_{ij} T y_j = \sum_{j=1}^k T S_{ij} y_j, \qquad i = 1,\dots,k, \quad y_1,\dots,y_k \in H, \]

which implies that for all $1 \le i,j \le k$ we have $S_{ij} \in \{T\}'$ for all $T \in \mathscr{A}$, so $S_{ij} \in \mathscr{A}'$.
But this clearly implies that $\rho(T_0)$ commutes with $S$.

We now apply the first part of the proof, with $H$, $\mathscr{A}$, and $T_0$ replaced by $H^k$, $\rho(\mathscr{A})$, and
$\rho(T_0)$ respectively. This gives an operator $T \in \mathscr{A}$ such that $\|(\rho(T_0) - \rho(T))x_0\| < \varepsilon$,
that is,

\[ \sum_{j=1}^k \|(T_0 - T)x_j\|^2 < \varepsilon^2. \]

In particular, (9.7) follows from this.

We have shown that every strongly open set containing an element from $\mathscr{A}''$ intersects
$\mathscr{A}$. This means that $\mathscr{A}$ is strongly dense in $\mathscr{A}''$. Since $\mathscr{A}$ was assumed to be strongly
closed, it follows that $\mathscr{A} = \mathscr{A}''$.


The next theorem provides a beautiful connection between bicommutants and the 
bounded functional calculus. 


Theorem 9.28 (von Neumann, bicommutant of a selfadjoint operator). Let $H$ be
separable and let $T \in \mathscr{L}(H)$ be selfadjoint. Then

\[ \{T\}'' = \{f(T) : f \in B_b(\sigma(T))\}. \]

Proof The inclusion ‘$\supseteq$’ holds for normal operators $T$ and arbitrary Hilbert spaces $H$,
and is proved as follows. The Fuglede–Putnam–Rosenblum theorem (Theorem 8.18)
implies that $T^* \in \{T\}''$. It follows that every operator of the form $p(T, T^*)$, with $p$ a
polynomial in $z$ and $\overline{z}$, is contained in $\{T\}''$. By the Stone–Weierstrass theorem, the
same is true for $f(T)$ for every function $f \in C(\sigma(T))$. By pointwise approximation from below,
the result extends to indicator functions $f = 1_U$ with $U \subseteq \sigma(T)$ relatively open. For all
$x \in H$, the outer regularity of $P_x$ implies that $P_B x = \lim_{n\to\infty} P_{U_n} x$ whenever the relatively
open sets $U_1 \supseteq U_2 \supseteq \cdots \supseteq B$ satisfy $\lim_{n\to\infty} P_x(U_n \setminus B) = 0$. Applying this to $Sx$ and $x$
with $S \in \{T\}'$, as a consequence we obtain

\[ P_B S x = \lim_{n\to\infty} P_{U_n} S x = \lim_{n\to\infty} S P_{U_n} x = S P_B x, \qquad x \in H, \]

which shows that $P_B \in \{S\}'$ for all $S \in \{T\}'$, that is, $P_B \in \{T\}''$. This, in turn, implies
that if $f \in B_b(\sigma(T))$ and $S \in \{T\}'$, then, upon approximating $f$ with simple functions,

\[ S f(T) = S \Bigl( \int_{\sigma(T)} f \, dP \Bigr) = \Bigl( \int_{\sigma(T)} f \, dP \Bigr) S = f(T) S \]

and therefore $f(T) \in \{T\}''$ for all $f \in B_b(\sigma(T))$.
The proof of the inclusion ‘$\subseteq$’ will be given in the next chapter, as it uses the spectral
theory for unbounded selfadjoint operators developed in that chapter.


9.6 Application to Orthogonal Polynomials 


In this final section we present an interesting application of the spectral theorem to
orthogonal polynomials. Let $\mu$ be a Borel measure on the real line satisfying the condition

\[ \int_{-\infty}^{\infty} |x|^n \, d\mu(x) < \infty, \qquad n \in \mathbb{N}. \qquad (9.8) \]

We further assume that the support of $\mu$ is not a finite set. Suppose that $(p_n)_{n\in\mathbb{N}}$ is a
sequence of polynomials with real coefficients satisfying the following two assumptions:


(i) for all $n \in \mathbb{N}$ we have $\deg(p_n) = n$;


(ii) for all $m, n \in \mathbb{N}$ with $m \ne n$ we have the orthogonality relation
\[ \int_{-\infty}^{\infty} p_m(x) p_n(x) \, d\mu(x) = 0. \qquad (9.9) \]
For $n = 0$, (i) is understood to mean that $p_0 \ne 0$. By linearity, (i) and (ii) imply
$\int_{-\infty}^{\infty} x^m p_n(x) \, d\mu(x) = 0$ whenever $m < n$.


Proposition 9.29. Let $\mu$ be a Borel measure on the real line with the properties stated
above. For any sequence of polynomials $(p_n)_{n\in\mathbb{N}}$ with real coefficients satisfying the
conditions (i) and (ii), there exist real numbers $A_n, B_n, C_n$ ($n \in \mathbb{N}$) satisfying $C_0 = 0$ and
$A_{n-1} C_n > 0$ ($n \ge 1$) such that, with $p_{-1} = 0$,

\[ x p_n = A_n p_{n+1} + B_n p_n + C_n p_{n-1}, \qquad n \in \mathbb{N}. \]

Proof Since $x p_n$ is a polynomial of degree $n+1$, it is of the form $x p_n = \sum_{j=0}^{n+1} c_{j,n} p_j$
with all coefficients $c_{j,n}$ real-valued. Then (9.9) implies

\[ \int_{-\infty}^{\infty} x p_n(x) p_m(x) \, d\mu(x) = c_{m,n} N_m, \quad \text{where } N_m := \int_{-\infty}^{\infty} p_m^2(x) \, d\mu(x). \]

Since the support of $\mu$ is not finite we have $N_m \ne 0$ for all $m \in \mathbb{N}$. For $n \in \mathbb{N}$ the
polynomial $p_n(x)$ is orthogonal to $x p_m(x)$ for all $m = 0, 1, \dots, n-2$. This forces $c_{m,n} = 0$ for
all $m = 0, 1, \dots, n-2$. This, in turn, implies

\[ x p_n = c_{n+1,n} p_{n+1} + c_{n,n} p_n + c_{n-1,n} p_{n-1}, \qquad n \in \mathbb{N}. \]

This gives the three point recurrence relation with $A_n = c_{n+1,n}$, $B_n = c_{n,n}$, and $C_n =
c_{n-1,n}$, with convention $C_0 = c_{-1,0} = 0$, say. Since the degree of $x p_n$ is $n+1$ we have
$A_n \ne 0$. Also, for $n \ge 1$,

\[
\begin{aligned}
A_{n-1} N_n = c_{n,n-1} N_n &= \int_{-\infty}^{\infty} x p_n(x) p_{n-1}(x) \, d\mu(x) \\
&= \int_{-\infty}^{\infty} x p_{n-1}(x) p_n(x) \, d\mu(x) = c_{n-1,n} N_{n-1} = C_n N_{n-1},
\end{aligned}
\qquad (9.10)
\]

and therefore $A_{n-1} C_n > 0$ for $n \ge 1$.


The polynomial $p_n$ has norm one in $L^2(\mathbb{R},\mu)$ if and only if $N_n = 1$. Hence if the
$p_n$ are orthonormal, (9.10) gives $0 \ne A_{n-1} = C_n$ for all $n \ge 1$. As an application of the
spectral theorem we show that, conversely, for every sequence of polynomials satisfying
the three point recurrence relation subject to the conditions $0 \ne A_{n-1} = C_n$ for all $n \ge 1$,
and satisfying the additional boundedness assumption

\[ \sup_{n\in\mathbb{N}} \max\{|A_n|, |B_n|, |C_n|\} < \infty, \]

there exists a finite Borel measure on the real line with respect to which the
polynomials are orthonormal.


Theorem 9.30 (Three-point recurrence). For every sequence $(p_n)_{n\in\mathbb{N}}$ of polynomials
satisfying the three point recurrence relation

\[ x p_n = A_n p_{n+1} + B_n p_n + C_n p_{n-1}, \qquad n \in \mathbb{N}, \]
with $p_{-1} = 0$, subject to the conditions $0 \ne A_{n-1} = C_n$ for all $n \ge 1$ and

\[ \sup_{n\in\mathbb{N}} \max\{|A_n|, |B_n|, |C_n|\} < \infty, \]

there exists a finite Borel measure $\mu$ on the real line which satisfies (9.8) and such that
the sequence $(p_n)_{n\in\mathbb{N}}$ is orthonormal in $L^2(\mathbb{R},\mu)$.


Proof Without loss of generality we may assume $p_0 = 1$.
On the Hilbert space $\ell^2(\mathbb{N})$ we consider the bounded operator

\[ T e_n := A_n e_{n+1} + B_n e_n + C_n e_{n-1}, \qquad n \in \mathbb{N}, \]

with the understanding that $T e_0 := A_0 e_1 + B_0 e_0$; boundedness of $T$ is a consequence of
the boundedness assumption on $A_n$, $B_n$, and $C_n$. Since $A_{n-1} = C_n$, from

\[
\begin{aligned}
(e_n|T^* e_m) &= A_n \delta_{m,n+1} + B_n \delta_{mn} + A_{n-1} \delta_{m,n-1} \\
&= A_{m-1} \delta_{m-1,n} + B_m \delta_{mn} + A_m \delta_{m+1,n} = (e_n|T e_m)
\end{aligned}
\]

(which is checked by hand also to hold if $n = 0$ or $m = 0$) we see that $T$ is selfadjoint.
Let $P$ be its projection-valued measure and define $\mu := P_{e_0}$. Then $\mu$ is a finite measure
supported on $\sigma(T)$, which is a compact set since $T$ is bounded.

Define a linear operator $U$ from $\ell^2_0(\mathbb{N})$, the span of the vectors $e_n$ in $\ell^2(\mathbb{N})$, into $L^2(\mathbb{R},\mu)$
by setting $U e_n := p_n$ for $n \in \mathbb{N}$. We further define the bounded operator $M : L^2(\mathbb{R},\mu) \to
L^2(\mathbb{R},\mu)$ by $Mf(x) := x f(x)$; the boundedness of $M$ follows from the fact that $\mu$ is
supported in a bounded interval $I$. From

\[ U T e_n = A_n p_{n+1} + B_n p_n + A_{n-1} p_{n-1} = x p_n = M U e_n \]

we see that $UT = MU$ as linear operators from $\ell^2_0(\mathbb{N})$ to $L^2(\mathbb{R},\mu)$. By a simple
induction argument, $U T^n = M^n U$ for all $n \in \mathbb{N}$.

We claim that $U$ extends to a unitary operator from $\ell^2(\mathbb{N})$ to $L^2(\mathbb{R},\mu)$. First we
check that $U$ has dense range. By the Stone–Weierstrass theorem, the functions $e_k(x) :=
e^{\pi i k x/|I|}$, $k \in \mathbb{Z}$, can be uniformly approximated on $I$ by polynomials, and the injectivity of
the Fourier transform of finite Borel measures (Theorem 5.32) implies that the span
of the functions $e_k$, $k \in \mathbb{Z}$, is dense in $L^2(I,\mu)$, hence in $L^2(\mathbb{R},\mu)$. These observations
imply that $U$ has dense range. Next, from $U T^n e_0 = M^n p_0 = x^n$ and $\mu = P_{e_0}$ we obtain

\[ (U T^m e_0|U T^n e_0) = (x^m|x^n) = \int_I x^{m+n} \, dP_{e_0}(x) = (T^{m+n} e_0|e_0) = (T^m e_0|T^n e_0) \]

using the functional calculus of $T$. The span of the vectors $T^n e_0$, $n \in \mathbb{N}$, being dense in
$\ell^2(\mathbb{N})$, this concludes the proof that $U$ extends to a unitary operator. It now follows from

\[ (p_m|p_n) = (U e_m|U e_n) = (e_m|e_n) = \delta_{mn} \]

that the polynomials $p_n$ are orthonormal in $L^2(\mathbb{R},\mu)$.
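
The proof realises the moments of $\mu = P_{e_0}$ as $\int x^k \, d\mu(x) = (T^k e_0|e_0)$, and this quantity can be computed exactly because $T^k e_0$ has only finitely many nonzero coordinates. The sketch below does so for one example chosen for illustration and not taken from the text: $A_n = \tfrac12$, $B_n = 0$, $C_n = \tfrac12$ for $n \ge 1$, the recurrence of the Chebyshev polynomials of the second kind. It relies on the standard fact, assumed here, that these polynomials are orthonormal for the measure $\tfrac{2}{\pi}\sqrt{1-x^2}\,dx$ on $[-1,1]$, whose even moments are $\mathrm{Cat}_k/4^k$ with $\mathrm{Cat}_k$ the Catalan numbers. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def jacobi_apply(vec, A, B, C):
    """Apply T e_n = A(n) e_{n+1} + B(n) e_n + C(n) e_{n-1} to a finitely
    supported vector (a list of coordinates); the result is one entry longer."""
    out = [Fraction(0)] * (len(vec) + 1)
    for n, v in enumerate(vec):
        out[n + 1] += A(n) * v
        out[n] += B(n) * v
        if n >= 1:                      # T e_0 has no e_{-1} component
            out[n - 1] += C(n) * v
    return out

def moment(k, A, B, C):
    """Return (T^k e_0 | e_0), the k-th moment of mu = P_{e_0} (real coefficients)."""
    vec = [Fraction(1)]                 # coordinates of e_0
    for _ in range(k):
        vec = jacobi_apply(vec, A, B, C)
    return vec[0]

# Assumed example: recurrence coefficients of the Chebyshev polynomials of the second kind.
half = Fraction(1, 2)
A = lambda n: half
B = lambda n: Fraction(0)
C = lambda n: half

moments = [moment(k, A, B, C) for k in range(9)]

def catalan(k):
    return comb(2 * k, k) // (k + 1)

# Moments of (2/pi) * sqrt(1 - x^2) dx on [-1, 1]: zero for odd k, Cat_{k/2} / 2^k for even k.
expected = [Fraction(catalan(k // 2), 4 ** (k // 2)) if k % 2 == 0 else Fraction(0)
            for k in range(9)]
print(moments == expected)
```

Exact rational arithmetic is used so that the comparison with the closed-form moments involves no rounding.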


Example 9.31. We have already encountered two examples of orthogonal polynomials. 


(i) The Hermite polynomials $H_n$, $n \in \mathbb{N}$, have been introduced in Section 3.5.b. They
are orthogonal with respect to the Gaussian measure $\frac{1}{\sqrt{2\pi}} \exp(-\tfrac12 x^2) \, dx$ on the
real line and satisfy the three point recurrence relation $H_0(x) = 1$, $H_1(x) = x$, and

\[ H_{n+2}(x) = x H_{n+1}(x) - (n+1) H_n(x), \qquad n \in \mathbb{N}. \]

(ii) The monic Laguerre polynomials $L_n$, $n \in \mathbb{N}$, have been introduced in Problem
3.14 as a scaled version of the Laguerre polynomials. They are orthogonal with
respect to the measure $1_{\mathbb{R}_+}(x) \exp(-x) \, dx$ on the real line and satisfy the three point
recurrence relation $L_0(x) = 1$, $L_1(x) = x - 1$, and

\[ L_{n+2}(x) = (x - 2n - 3) L_{n+1}(x) - (n+1)^2 L_n(x), \qquad n \in \mathbb{N}. \]
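
The two recurrences can be checked against the stated orthogonality in exact integer arithmetic, since the moments of both measures are known in closed form: $\int x^k \, d\gamma(x) = (k-1)!!$ for even $k$ (and $0$ for odd $k$) for the standard Gaussian measure $\gamma$, and $\int_0^\infty x^k e^{-x} \, dx = k!$. The following is a minimal sketch with hypothetical helper names; it tests only the first six polynomials of each family.

```python
from math import factorial

def next_poly(p_prev, p_cur, a, b):
    """Return (x - a) * p_cur - b * p_prev; polynomials are coefficient lists,
    lowest degree first."""
    out = [0] + p_cur                   # x * p_cur
    for i, c in enumerate(p_cur):
        out[i] -= a * c
    for i, c in enumerate(p_prev):
        out[i] -= b * c
    return out

def build(p0, p1, coeff, count):
    """First `count` polynomials of p_{n+2} = (x - a_n) p_{n+1} - b_n p_n,
    where coeff(n) returns the pair (a_n, b_n)."""
    polys = [p0, p1]
    for n in range(count - 2):
        a, b = coeff(n)
        polys.append(next_poly(polys[n], polys[n + 1], a, b))
    return polys

def inner(p, q, moment):
    """Integral of p * q against a measure given through its moments."""
    return sum(c * d * moment(i + j)
               for i, c in enumerate(p) for j, d in enumerate(q))

def gaussian_moment(k):
    """k-th moment of the standard Gaussian measure."""
    if k % 2:
        return 0
    return factorial(k) // (2 ** (k // 2) * factorial(k // 2))

def exponential_moment(k):
    """k-th moment of exp(-x) dx on the positive half-line."""
    return factorial(k)

hermite = build([1], [0, 1], lambda n: (0, n + 1), 6)
laguerre = build([1], [-1, 1], lambda n: (2 * n + 3, (n + 1) ** 2), 6)

hermite_gram = [[inner(p, q, gaussian_moment) for q in hermite] for p in hermite]
laguerre_gram = [[inner(p, q, exponential_moment) for q in laguerre] for p in laguerre]
print(hermite_gram[3], laguerre_gram[2])
```

Both Gram matrices come out diagonal, with diagonal entries $n!$ for the Hermite polynomials and $(n!)^2$ for the monic Laguerre polynomials, so the families are orthogonal but not orthonormal in this normalisation.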


Problems 


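9.1 Let $T \in \mathscr{L}(H)$ be a compact normal operator with spectral decomposition $T =
\sum_{n\ge 1} \lambda_n P_n$. Prove that if $f : \sigma(T) \to \mathbb{C}$ is a bounded function, then
\[ f(T)x = \sum_{n\ge 1} f(\lambda_n) P_n x, \qquad x \in H. \]
Does the sum $\sum_{n\ge 1} f(\lambda_n) P_n$ converge to $f(T)$ in the norm of $\mathscr{L}(H)$?
9.2 Let $T \in \mathscr{L}(H)$ be a compact normal operator. Give a direct proof (that is, without
invoking Theorem 7.11) of the following two statements:


(a) If $\lambda$ is a nonzero element of $\sigma(T)$, then $\lambda$ is an eigenvalue.
Hint: Use the fact that $R(\lambda - T)$ is closed (Lemma 7.9) to establish the
equivalences $N(\lambda - T) = \{0\} \Leftrightarrow N(\overline{\lambda} - T^*) = \{0\} \Leftrightarrow R(\lambda - T) = H$.

(b) If $T$ has infinitely many distinct eigenvalues $\lambda_n$, then $\lim_{n\to\infty} \lambda_n = 0$.
Hint: Choose eigenvectors $T h_n = \lambda_n h_n$ and define $H_0 := \{0\}$ and $H_n :=
\operatorname{span}\{h_1,\dots,h_n\}$ for $n = 1, 2, \dots$ Show that $H_{n-1} \subsetneq H_n$ and choose norm one
vectors $x_n \in H_n \cap H_{n-1}^\perp$. Show that if $n > m$, then $\|T x_m - T x_n\| \ge |\lambda_n|$.


9.3 Let $T \in \mathscr{L}(H)$ be a selfadjoint operator with projection-valued measure $P$. Prove
the following results.


(a) If $t \in \mathbb{R}$, then for all $x \in H$ we have
\[ \lim_{\varepsilon \downarrow 0} i\varepsilon R(t + i\varepsilon, T)x = P_{\{t\}} x. \]
(b) If $a, b \in \mathbb{R}$ with $a < b$, then for all $x \in H$ we have Stone’s formula

\[ \lim_{\varepsilon \downarrow 0} \frac{1}{2\pi i} \int_a^b \bigl( R(t - i\varepsilon, T)x - R(t + i\varepsilon, T)x \bigr) \, dt = \frac12 \bigl( P_{[a,b]} x + P_{(a,b)} x \bigr). \]

Hint: Show first that

\[
\lim_{\varepsilon \downarrow 0} \frac{1}{2\pi i} \int_a^b \Bigl( \frac{1}{\lambda - i\varepsilon - t} - \frac{1}{\lambda + i\varepsilon - t} \Bigr) \, d\lambda =
\begin{cases}
0, & t \notin [a,b], \\
\frac12, & t \in \{a,b\}, \\
1, & t \in (a,b).
\end{cases}
\]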

9.4 Let $T_1, T_2 \in \mathscr{L}(H)$ be selfadjoint operators with projection-valued measures $P^{(1)}$
and $P^{(2)}$, respectively. Prove that the following assertions are equivalent:


(1) the projection-valued measures $P^{(1)}$ and $P^{(2)}$ commute, that is, for all Borel
sets $B_1, B_2$ in $\mathbb{R}$ we have

\[ P^{(1)}_{B_1} P^{(2)}_{B_2} = P^{(2)}_{B_2} P^{(1)}_{B_1}; \]

(2) the resolvents of $T_1$ and $T_2$ commute, that is, for all $\lambda_1 \in \rho(T_1)$ and $\lambda_2 \in \rho(T_2)$
we have
\[ R(\lambda_1, T_1) R(\lambda_2, T_2) = R(\lambda_2, T_2) R(\lambda_1, T_1); \]
(3) for all $t_1, t_2 \in \mathbb{R}$ we have
\[ \exp(i t_1 T_1) \exp(i t_2 T_2) = \exp(i t_2 T_2) \exp(i t_1 T_1). \]
Hint: For the implication (2)$\Rightarrow$(1) use the results of the preceding problem; for the
implication (3)$\Rightarrow$(1) write $\exp(i t_1 T_1)$ and $\exp(i t_2 T_2)$ as spectral integrals with
respect to $P^{(1)}$ and $P^{(2)}$ and use the properties of the Fourier–Plancherel transform
to deduce that for all $f, g \in \mathscr{F}^2(\mathbb{R})$ we have $\widehat{f}(T_1) \widehat{g}(T_2) = \widehat{g}(T_2) \widehat{f}(T_1)$ and hence,
for all $f, g \in \mathscr{F}^2(\mathbb{R})$,

\[ f(T_1) g(T_2) = g(T_2) f(T_1). \]
By approximation with functions in $\mathscr{F}^2(\mathbb{R})$, deduce that

\[ P^{(1)}_{(a_1,b_1)} P^{(2)}_{(a_2,b_2)} = P^{(2)}_{(a_2,b_2)} P^{(1)}_{(a_1,b_1)} \]

for all $a_1, a_2, b_1, b_2 \in \mathbb{R}$ with $a_1 < b_1$ and $a_2 < b_2$.
9.5 Let $T \in \mathscr{L}(H)$ be a normal operator with projection-valued measure $P$. Let $B$ be
a Borel subset of $\sigma(T)$.

(a) Show that $T$ leaves the range of $P_B$ invariant.

(b) Show that $\sigma(T|_{R(P_B)}) \subseteq \overline{B}$.
9.6 Prove that if $T \in \mathscr{L}(H)$ is normal, then

\[ \|T\| = \sup\{|(Tx|x)| : \|x\| = 1\}. \]

Hint: Fix $\lambda_0 \in \sigma(T)$ with $|\lambda_0| = \|T\|$ (why does such $\lambda_0$ exist?) and $\varepsilon > 0$, and
consider the projection $P_{B(\lambda_0;\varepsilon)}$, where $P$ is the projection-valued measure of $T$.
Show that if $x$ in the range of $P_{B(\lambda_0;\varepsilon)}$ has norm one, then $|(Tx|x)| \ge \|T\| - \varepsilon$.
9.7 Let $T \in \mathscr{L}(H)$ be a normal operator with projection-valued measure $P$.
(a) Show that $N(T) = R(P_{\{0\}})$.
Hint: For the inclusion $\subseteq$, write $\sigma(T) \setminus \{0\}$ as a countable union of Borel sets
$B_n$, each of which has the property that $\inf\{|\lambda| : \lambda \in B_n\} > 0$, and consider

\[
f_n(z) :=
\begin{cases}
z^{-1}, & z \in B_n, \\
0, & z \in \sigma(T) \setminus B_n.
\end{cases}
\]
(b) Conclude that if $\lambda \in \sigma(T)$ is an eigenvalue, then $P_{\{\lambda\}}$ equals the orthogonal
projection $P_\lambda$ onto the eigenspace $E_\lambda$.
(c) Conclude that if $\lambda \in \sigma(T)$ is an isolated point, then $\lambda$ is an eigenvalue and
$P_{\{\lambda\}} = P_\lambda$ equals the spectral projection associated with $\{\lambda\}$.
9.8 Show that the space $L^\infty(\Omega, P)$ introduced in Definition 9.10 is a Banach space.
9.9 Prove the following two claims made in Example 9.16:


(a) The position operator $X$ on $H = L^2(0,1)$ defined by

\[ Xf(x) := x f(x), \qquad x \in (0,1), \ f \in L^2(0,1), \]
has spectrum $\sigma(X) = [0,1]$.
(b) The projection-valued measure of $X$ is given by $P_B f = 1_B f$ for all Borel
subsets $B$ of $[0,1]$ and $f \in L^2(0,1)$.
9.10 Find the projection-valued measures of the following unitary operators:
(a) the right shift on $\ell^2(\mathbb{Z})$;
(b) translation over $t$ on $L^2(\mathbb{R})$;
(c) rotation over $\theta$ on $L^2(\mathbb{T})$;
(d) the Fourier transform on $L^2(\mathbb{R}^d)$.
Hint: For parts (c) and (d) revisit Problems 8.14 and 6.9, respectively.
9.11 Let $k \in L^2((0,1) \times (0,1))$ satisfy $k(s,t) = \overline{k(t,s)}$ almost everywhere. Find the
projection-valued measure of the selfadjoint integral operator $T$ on $L^2(0,1)$,

\[ Tf(t) := \int_0^1 k(t,s) f(s) \, ds, \qquad f \in L^2(0,1). \]

9.12 Let $T \in \mathscr{L}(H)$ be selfadjoint.
(a) Show that there exist positive operators $T^+$ and $T^-$ such that $T = T^+ - T^-$.
Hint: Consider the functions $f^+(t) := t^+$ and $f^-(t) := t^-$.
(b) Show that these operators are unique if we also ask that $T^+ T^- = T^- T^+ = 0$.


9.13 Prove that if $U \in \mathscr{L}(H)$ is unitary, there exists a selfadjoint operator $T$ such that
$U = e^{iT}$ and $\sigma(T) \subseteq [-\pi, \pi]$. Is this operator $T$ unique?

Hint: Write $U = \int_{\sigma(U)} \lambda \, dP(\lambda)$ as in Theorem 9.14. Find a projection-valued
measure $Q$ on $[-\pi, \pi]$ whose image under $t \mapsto e^{it}$ is $P$.

9.14 This problem outlines another proof of the spectral theorem for normal operators.


(a) Explain how the proof of the spectral theorem for normal operators simplifies
for selfadjoint operators.

(b) Deduce the spectral theorem for normal operators from the selfadjoint case
by considering, for a normal operator $T$, the selfadjoint operators $\frac12 (T + T^*)$
and $\frac{1}{2i} (T - T^*)$, and applying Lemma 9.23 to their projection-valued
measures.


10 


The Spectral Theorem for Unbounded Normal 
Operators 


Up to this point we have been dealing exclusively with bounded operators. In order 
to make the functional analytic apparatus applicable to the study of partial differential 
equations we need to accommodate differential operators into the theory. This leads 
to the notion of an unbounded operator as a linear operator defined only on a suitable 
subspace, the domain of the operator. Of special interest are unbounded selfadjoint and 
normal operators, and the main goal of this chapter is to extend the spectral theorems of 
the preceding chapter to these classes of operators. 


10.1 Unbounded Operators 


Throughout this chapter, X and Y are Banach spaces. 


Definition 10.1 (Linear operators). A linear operator from X to Y is a pair (A,D(A)), 
where D(A) is a subspace of X and A: D(A) — Y is a linear operator. The subspace 
D(A) is called the domain of A. A linear operator is densely defined when D(A) is a 
dense subspace of X. 


When no confusion is likely to arise we omit the domain from the notation and write
$A$ instead of $(A, D(A))$.

It is perfectly allowable that $D(A) = X$, so in particular every bounded operator $A :
X \to Y$ is a linear operator in the sense of the above definition. More generally it may
happen that there exists a constant $C \ge 0$ such that $\|Ax\| \le C\|x\|$ for all $x \in D(A)$. In
this situation, $A$ admits a unique extension to a bounded operator (of norm at most $C$)
defined on the closure of $D(A)$. The interest in the above definition arises from the
fact that many interesting examples of unbounded linear operators exist, that is, linear
operators for which such a constant $C$ does not exist. Typical examples, treated in more
detail below, include differential operators and multiplication operators with unbounded
multipliers.

The terms ‘linear operators’ and ‘unbounded operators’ are often used interchangeably.
With the latter terminology, however, it becomes somewhat ambiguous whether
bounded operators are to be considered as special cases of unbounded operators. To
avoid such trivial issues we generally prefer the terminology ‘linear operator’, which is
neutral in this respect.
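
To see concretely why no bounding constant can exist for a differential operator, consider the following short computation. It uses the derivative operator on $C[0,1]$ with domain $C^1[0,1]$ (the operator of Example 10.5 below); the test functions $f_n$ are chosen here purely for illustration.

```latex
% X = Y = C[0,1] with the supremum norm, D(A) = C^1[0,1], Af = f'.
% The functions f_n are an illustrative choice, not taken from the text.
\[
f_n(x) := x^n, \qquad
\|f_n\|_\infty = \sup_{x\in[0,1]} x^n = 1, \qquad
\|A f_n\|_\infty = \sup_{x\in[0,1]} n x^{n-1} = n .
\]
% If \|Af\|_\infty \le C \|f\|_\infty held for all f in D(A), then n \le C for every n \ge 1,
% which is impossible. Hence A is unbounded in the sense described above.
```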


10.1.a Closed Operators 


A linear operator from $X$ to $Y$ with domain $D(A) = X$ is bounded if and only if its graph is
closed in $X \times Y$, the ‘if’ part being the content of the closed graph theorem. This
motivates the following definition.


Definition 10.2 (Closed operators). A linear operator A from X to Y is called closed 
when its graph 


G(A) := {(x,Ax) : x € D(A)} 
is closed in X x Y. 


Every bounded operator from X to Y is closed, and by the closed graph theorem a 
closed operator with domain D(A) = X is bounded. 

If $A$ is a linear operator with domain $D(A)$, then $A$ is bounded (in fact, contractive) as
an operator defined on the normed space $D(A)$ endowed with the graph norm

\[ \|x\|_{D(A)} := \|x\| + \|Ax\|, \qquad x \in D(A). \]
This follows from the trivial inequality
\[ \|Ax\| \le \|x\| + \|Ax\| = \|x\|_{D(A)}. \]

The following proposition gives a necessary and sufficient condition for closedness in 
terms of the graph norm. 


Proposition 10.3. A linear operator is closed if and only if its domain is a Banach 
space with respect to its graph norm. 


Proof ‘If’: Suppose that the domain $D(A)$ of the linear operator $A$ is complete with
respect to its graph norm. To prove that $A$ is closed we must show that its graph is
closed, or equivalently, sequentially closed, in $X \times Y$. Let $(x_n, Ax_n)_{n\ge 1}$ be a sequence
in the graph of $A$ converging to some limit $(x, y)$ in $X \times Y$. We must check that $(x, y)$ belongs
to the graph of $A$. By the properties of product norms, we have $x_n \to x$ in $X$ and $Ax_n \to y$ in $Y$.
In particular, the sequences $(x_n)_{n\ge 1}$ and $(Ax_n)_{n\ge 1}$ are Cauchy in $X$ and $Y$ respectively.
Then the sequence $(x_n)_{n\ge 1}$ is Cauchy in $D(A)$ since

\[ \|x_n - x_m\|_{D(A)} = \|x_n - x_m\| + \|Ax_n - Ax_m\| \to 0 \quad \text{as } m, n \to \infty. \]

By the completeness of $D(A)$, the sequence $(x_n)_{n\ge 1}$ converges in $D(A)$, say $x_n \to x'$ in
$D(A)$. This means that $x_n \to x'$ in $X$ and $Ax_n \to Ax'$ in $Y$. Comparing limits we find that
$x' = x$ and $Ax' = y$. This means that $(x, y) = (x', Ax')$ belongs to the graph of $A$.

‘Only if’: Assume now that $A$ is closed and that $(x_n)_{n\ge 1}$ is a Cauchy sequence in
$D(A)$, so $(x_n)_{n\ge 1}$ is a Cauchy sequence in $X$ and $(Ax_n)_{n\ge 1}$ is a Cauchy sequence in $Y$.
Then $((x_n, Ax_n))_{n\ge 1}$ is a Cauchy sequence in $X \times Y$. This Cauchy sequence is contained
in the graph of $A$. This graph is closed by our assumption, and since closed subspaces
of Banach spaces are Banach spaces, this Cauchy sequence converges in $X \times Y$ to a
limit contained in the graph of $A$, say $(x_n, Ax_n) \to (x, Ax)$ in $X \times Y$. This implies that
$x_n \to x$ in $X$ and $Ax_n \to Ax$ in $Y$, which is the same as saying that $x_n \to x$ in $D(A)$ with
respect to the graph norm. We have thus shown that every Cauchy sequence in $D(A)$ is
convergent.


The following proposition gives a convenient sequential criterion for a linear operator 
to be closed. The proof is already implicitly contained in the proof of Proposition 10.3 
and the reader is invited to make this explicit. 


Proposition 10.4. A linear operator $A$ with domain $D(A)$ is closed if and only if the
following holds: whenever $x_n \to x$ in $X$, with $x_n \in D(A)$ for all $n$, and $Ax_n \to y$ in $Y$, then
$x \in D(A)$ and $Ax = y$.


This criterion is used to prove closedness in the next two examples. 


Example 10.5. The derivative operator, as a linear operator in $C[0,1]$ with domain
$C^1[0,1]$, is densely defined and closed. The density of $C^1[0,1]$ in $C[0,1]$ is clear (by the
Weierstrass approximation theorem we can even approximate continuous functions with
polynomials). To prove closedness, suppose that $f_n \to f$ in $C[0,1]$, with $f_n \in C^1[0,1]$ for
all $n$, and $f_n' \to g$ in $C[0,1]$. We must prove that $f \in C^1[0,1]$ and $f' = g$. For all $x \in [0,1]$
we have

\[ f(x) - f(0) = \lim_{n\to\infty} \bigl( f_n(x) - f_n(0) \bigr) = \lim_{n\to\infty} \int_0^x f_n'(y) \, dy = \int_0^x g(y) \, dy, \]

using the uniform convergence of $f_n'$ to $g$ in the last step. The right-hand side is a
continuously differentiable function of $x$, with derivative $g$. This proves that $f \in C^1[0,1]$ and
$f' = g$.

An analogous result holds for weak derivatives in $L^p(D)$, where $D$ is an open subset
of $\mathbb{R}^d$; see Section 11.1.a.


Example 10.6. Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and let $1 \le p \le \infty$. Given a measurable
function $m : \Omega \to \mathbb{K}$ we may define

\[
\begin{aligned}
D(A_m) &:= \{f \in L^p(\Omega) : mf \in L^p(\Omega)\}, \\
A_m f &:= mf, \qquad f \in D(A_m).
\end{aligned}
\]

We claim that the linear operator $A_m$ is closed. Moreover, if $1 \le p < \infty$, then $A_m$ is
densely defined.

To prove closedness, let $f_n \to f$ in $L^p(\Omega)$ with each $f_n$ in $D(A_m)$ and $A_m f_n \to g$ in
$L^p(\Omega)$. We must show that $f \in D(A_m)$ and $A_m f = g$. By passing to a subsequence we
may assume that both convergences also hold pointwise $\mu$-almost everywhere. Then,
for $\mu$-almost all $\omega \in \Omega$,

\[ g(\omega) = \lim_{n\to\infty} A_m f_n(\omega) = \lim_{n\to\infty} m(\omega) f_n(\omega) = m(\omega) f(\omega). \]

This proves that $mf \in L^p(\Omega)$ and $mf = g$ $\mu$-almost everywhere, hence as elements of
$L^p(\Omega)$. Equivalently, this says that $f \in D(A_m)$ and $A_m f = g$.

Now let $1 \le p < \infty$. By dominated convergence, $\lim_{n\to\infty} 1_{\{|m| \le n\}} f = f$ for all $f \in
L^p(\Omega)$, with convergence in the norm of $L^p(\Omega)$. Since $1_{\{|m| \le n\}} f \in D(A_m)$, this shows
that $D(A_m)$ is dense in $L^p(\Omega)$.


Example 10.7. If A is a closed operator and B is bounded, then the operator A + B with 
domain D(A + B) := D(A) defined by (A + B)x := Ax + Bx for x € D(A) is closed. The 
easy proof is left as an exercise. 


Example 10.8. If $A$ is an injective closed operator (in particular, if $A$ is an injective
bounded operator), its inverse $A^{-1}$, with domain $D(A^{-1}) = R(A)$, is closed. This is
immediate by noting that the graph of $A^{-1}$ equals

\[ \{(y, A^{-1} y) : y \in D(A^{-1})\} = \{(Ax, x) : x \in D(A)\} \]
and that the latter is closed in $Y \times X$ since $\{(x, Ax) : x \in D(A)\}$ is closed in $X \times Y$.
Further examples will be given later on. We highlight two of them: 


Example 10.9. The adjoint $A^*$ of a densely defined linear operator $A$ acting between
Banach spaces is closed by Proposition 10.18. Likewise, by Proposition 10.22, the
Hilbertian adjoint $A^*$ of a densely defined linear operator $A$ acting between Hilbert spaces is
closed.


Example 10.10. Generators of $C_0$-semigroups are closed by Proposition 13.4.


It frequently happens that linear operators are initially defined on a ‘too small’ domain 
to be closed, but can be extended to a closed operator on a larger domain. Typical 
examples of this situation arise in connection with differential operators, which initially 
can be defined on compactly supported smooth functions only. 

When $A$ and $B$ are linear operators satisfying $D(A) \subseteq D(B)$ and $Ax = Bx$ for all $x \in
D(A)$, we call $B$ an extension of $A$, notation:

\[ A \subseteq B. \]

Definition 10.11 (Closability and closure). A linear operator is said to be closable if
it has a closed extension, or equivalently, if the closure of its graph is the graph of a
linear operator. The unique linear operator $\overline{A}$ whose graph is the closure of the graph of
a closable operator $A$ is called the closure of $A$; it is the smallest closed extension of $A$.


We have the following analogue of Proposition 10.4: 


Proposition 10.12. A linear operator A with domain D(A) is closable if and only if the 
following holds: whenever x, — 0 in X and AX, — y in Y, with all x, in D(A), then 
y=0. 


Proof We only need to prove the ‘if’ part, the ‘only if’ part being trivial. Denote the
closure of $G(A)$ by $G$. We must prove that, under the stated condition, $G$ is the graph of a
linear operator $B$. This is the case if and only if $(x,y_1) \in G$, $(x,y_2) \in G$ implies $y_1 = y_2$,
for in that case we may define $D(B)$ to be the set of all $x \in X$ such that $(x,y) \in G$ for some $y \in Y$;
for $x \in D(B)$ we may then define $Bx := y$, where $y \in Y$ is the unique element such that
$(x,y) \in G$. By a limiting argument, the linearity of $A$ implies that the operator $B$ thus
defined is linear. Clearly it extends $A$ and its graph $G(B) = G$ is closed.

Suppose, therefore, that $(x,y_1) \in G$ and $(x,y_2) \in G$. Then $(0,y_1 - y_2) \in G$ since $G$
is a linear subspace of $X \times Y$, and this means that there exists a sequence $(x_n)$ in $D(A)$ such that
$(x_n, Ax_n) \to (0, y_1 - y_2)$ in $X \times Y$. But then $x_n \to 0$ in $X$ and $Ax_n \to y_1 - y_2$ in $Y$. By our assumption,
this forces $y_1 - y_2 = 0$.
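
To see that the criterion of Proposition 10.12 can fail, here is a standard illustration (not taken from the text; the spaces and the functions $f_n$ are chosen for this sketch only). Take $X = L^2(0,1)$, $Y = \mathbb{C}$, $D(A) = C[0,1]$, and let $A$ be evaluation at the origin.

```latex
% Sketch: point evaluation is not closable.
% Assumed setting: X = L^2(0,1), Y = \mathbb{C}, D(A) = C[0,1], Af := f(0).
\[
  f_n(t) := \max\{0,\, 1 - nt\}, \qquad t \in [0,1],\ n \ge 1 .
\]
\[
  \|f_n\|_{L^2(0,1)}^2 = \int_0^{1/n} (1 - nt)^2 \, dt = \frac{1}{3n} \to 0,
  \qquad
  A f_n = f_n(0) = 1 \to 1 \neq 0 .
\]
% Hence x_n := f_n \to 0 in X while A x_n \to y = 1 \neq 0,
% so by Proposition 10.12 the operator A is not closable.
```

In particular, the closure of the graph of this operator contains both $(0,0)$ and $(0,1)$ and therefore is not the graph of a linear operator.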


Example 10.13. In the setting of Examples 10.5 and 10.6, a closable operator is ob- 
tained by replacing the domain D(A) of the operator A by any smaller subspace Y. The 
closure of the operator thus obtained equals A if and only if Y is dense in D(A) with 
respect to the graph norm. 


Example 10.14. It is shown in Proposition 10.34 in the next section that every densely 
defined symmetric operator acting in a Hilbert space is closable. 


Example 10.15. Let $D$ be a nonempty open subset of $\mathbb{R}^d$, let $1 \le p \le \infty$, and let $\alpha \in \mathbb{N}^d$
be a multi-index. In $L^p(D)$ we consider the linear operator $A$ with domain $C_c^\infty(D)$ defined
by

$$Af := \partial^\alpha f, \qquad f \in C_c^\infty(D),$$

where $\partial^\alpha = \partial_1^{\alpha_1} \circ \cdots \circ \partial_d^{\alpha_d}$, with $\partial_j = \partial/\partial x_j$ the $j$th directional derivative. We claim that
$A$ is closable. Indeed, suppose the functions $f_n \in C_c^\infty(D)$ satisfy $f_n \to 0$ and $Af_n \to g$ in
$L^p(D)$. Integrating by parts, for all $\phi \in C_c^\infty(D)$ we obtain

$$\int_D g\phi \, dx = \lim_{n\to\infty} \int_D (\partial^\alpha f_n)\phi \, dx = (-1)^{|\alpha|} \lim_{n\to\infty} \int_D f_n \, \partial^\alpha \phi \, dx = 0, \tag{10.1}$$

where $|\alpha| = \alpha_1 + \cdots + \alpha_d$; the last step follows by Hölder’s inequality (cf. Corollary
2.25). It is shown in Proposition 11.5 in the next chapter that (10.1) implies $g = 0$
almost everywhere.


Example 10.16. Let $D$ be a nonempty open subset of $\mathbb{R}^d$ and let $1 \le p \le \infty$. In $L^p(D)$
we consider the linear operator $A$ with domain $C_c^\infty(D)$ defined by

$$Af := \Delta f, \qquad f \in C_c^\infty(D),$$

where $\Delta f = \partial_1^2 f + \cdots + \partial_d^2 f$ is the Laplacian of $f$. In the same way as in the previous
example one shows that this operator is closable. Various explicit descriptions of its 
closure can be given, some of which are discussed in Chapters 11-13; see in particular 
Sections 11.1.e, 12.3, and 13.6.c. 


10.1.b The Adjoint Operator 


When $A$ is a densely defined linear operator from $X$ to $Y$, we may uniquely define a
linear operator $A^*$ from $Y^*$ to $X^*$ by defining its domain $D(A^*)$ to be the set of all
$y^* \in Y^*$ with the property that there exists an element $x^* \in X^*$ such that

$$\langle x, x^* \rangle = \langle Ax, y^* \rangle, \qquad x \in D(A).$$

Since $D(A)$ is dense in $X$, the element $x^* \in X^*$ (if it exists) is unique and we can set

$$A^* y^* := x^*, \qquad y^* \in D(A^*).$$

Thus, by definition, we have the identity

$$\langle Ax, y^* \rangle = \langle x, A^* y^* \rangle, \qquad x \in D(A), \ y^* \in D(A^*).$$

Definition 10.17 (Adjoint operator). The operator $A^*$ is called the adjoint of $A$.


The adjoint of a closable densely defined operator $A$ equals the adjoint of the closure
$\overline{A}$, for if $x^* \in X^*$ and $y^* \in Y^*$ are such that $\langle x, x^* \rangle = \langle Ax, y^* \rangle$ for all $x \in D(A)$, by
continuity this identity extends to all $x \in D(\overline{A})$.


Proposition 10.18. If $A$ is a densely defined linear operator from $X$ to $Y$, then $A^*$ is
weak* closed in the sense that its graph is weak* closed in $Y^* \times X^*$ and we have

$$G(A^*) = (J(G(A)))^\perp,$$

where $J : X \times Y \to Y \times X$ is defined by $J(x,y) := (-y,x)$. If $A$ is densely defined and
closed, then $A^*$ is weak* densely defined in the sense that its domain is weak* dense.


Proof The pairing

$$\langle (y,x), (y^*,x^*) \rangle := \langle y, y^* \rangle + \langle x, x^* \rangle$$

allows us to identify $Y^* \times X^*$ with the dual of $Y \times X$. By the definition of the adjoint
operator we have $(y^*,x^*) \in G(A^*)$ if and only if

$$\langle (-Ax, x), (y^*,x^*) \rangle = 0, \qquad x \in D(A).$$

This proves the identity $G(A^*) = (J(G(A)))^\perp$. Since annihilators are weak* closed, this
proves that $A^*$ is weak* closed.

If $D(A^*)$ is not weak* dense, then by Proposition 4.44 there exists a nonzero element
$y_0 \in Y$ such that $\langle y_0, y^* \rangle = 0$ for all $y^* \in D(A^*)$. By assumption $G(A)$ is closed in $X \times Y$
and $(0, y_0) \notin G(A)$. It follows that $J(G(A))$ is a closed subspace of $Y \times X$ not containing
$J(0,y_0) = (-y_0, 0)$, so by the Hahn-Banach theorem there exists an element $(y_0^*, x_0^*) \in
Y^* \times X^*$ annihilating $J(G(A))$ but not $(-y_0, 0)$. In other words,

$$\langle x, x_0^* \rangle = \langle Ax, y_0^* \rangle, \qquad x \in D(A),$$

and

$$\langle y_0, y_0^* \rangle \neq 0.$$

The first equality implies that $y_0^* \in D(A^*)$, so the second one implies that $y_0$ does not
vanish against every element of $D(A^*)$, contradicting the choice of $y_0$.


If the linear operators $A$ and $B$ act from $X$ to $Y$, we define the operator $A + B$ acting
from $X$ to $Y$ by

$$D(A+B) := D(A) \cap D(B), \qquad (A+B)x := Ax + Bx, \quad x \in D(A+B).$$

If $A$ acts from $X$ to $Y$ and $B$ acts from $Y$ to another Banach space $Z$, we define the
operator $BA$ acting from $X$ to $Z$ by

$$D(BA) := \{x \in D(A) : Ax \in D(B)\}, \qquad BAx := B(Ax), \quad x \in D(BA).$$

There is of course a priori no guarantee that $D(A+B)$ and $D(BA)$ contain any nontrivial
elements even when both $A$ and $B$ are densely defined.


Proposition 10.19. Let $A$ and $B$ be densely defined operators acting in the ways indicated
above. Then:


(1) if $A \subseteq B$, then $B^* \subseteq A^*$;
(2) if $A + B$ is densely defined, then $A^* + B^* \subseteq (A+B)^*$, with equality if $B$ is bounded;
(3) if $BA$ is densely defined, then $A^* B^* \subseteq (BA)^*$, with equality if $B$ is bounded.


Proof Part (1) is immediate from the definitions.
If $y^* \in D(A^* + B^*) = D(A^*) \cap D(B^*)$, then for all $x \in D(A+B) = D(A) \cap D(B)$ we
have $\langle (A+B)x, y^* \rangle = \langle Ax, y^* \rangle + \langle Bx, y^* \rangle = \langle x, A^*y^* + B^*y^* \rangle$, so $y^* \in D((A+B)^*)$ and
$(A+B)^* y^* = A^*y^* + B^*y^*$. If $B$ is bounded and $y^* \in D((A+B)^*)$, then for all $x \in D(A+B) = D(A)$
we have $\langle Ax, y^* \rangle = \langle (A+B)x, y^* \rangle - \langle Bx, y^* \rangle = \langle x, (A+B)^*y^* \rangle - \langle x, B^*y^* \rangle$, so
that $y^* \in D(A^*) = D(A^* + B^*)$ and $A^*y^* = (A+B)^*y^* - B^*y^*$. This gives (2).

If $z^* \in D(A^*B^*)$, then $z^* \in D(B^*)$ and $B^*z^* \in D(A^*)$, and for all $x \in D(BA)$ we have
$\langle (BA)x, z^* \rangle = \langle Ax, B^*z^* \rangle = \langle x, A^*B^*z^* \rangle$, so that $z^* \in D((BA)^*)$ and $(BA)^*z^* = A^*B^*z^*$. If
$B$ is bounded and $z^* \in D((BA)^*)$, then for all $x \in D(BA) = D(A)$ we have $\langle Ax, B^*z^* \rangle =
\langle BAx, z^* \rangle = \langle x, (BA)^*z^* \rangle$, so $B^*z^* \in D(A^*)$ and $z^* \in D(A^*B^*)$ and $A^*B^*z^* = (BA)^*z^*$.
This gives (3).


We have the following useful duality criterion to decide whether an element belongs 
to the domain of an operator. 


Proposition 10.20. Let $A$ be a densely defined closed operator from $X$ to $Y$. If $x \in X$
and $y \in Y$ are such that $\langle y, y^* \rangle = \langle x, A^*y^* \rangle$ for all $y^* \in D(A^*)$, then $x \in D(A)$ and $Ax = y$.


Proof By the Hahn-Banach theorem, the result follows once we have checked that
$\langle (x,y), (x^*,y^*) \rangle = 0$ for all $(x^*,y^*) \in (G(A))^\perp$. Indeed, this gives

$$(x,y) \in {}^\perp\bigl((G(A))^\perp\bigr) = \overline{G(A)}^{\,\mathrm{weak}} = G(A),$$

where the first identity follows from Proposition 4.45 and the second from the fact
that closed subspaces are weakly closed by the Hahn-Banach theorem (see Proposition
4.42).

Fix an arbitrary $(x^*,y^*) \in (G(A))^\perp$. For all $u \in D(A)$ we have $(u, Au) \in G(A)$ and
therefore

$$0 = \langle (u, Au), (x^*,y^*) \rangle = \langle u, x^* \rangle + \langle Au, y^* \rangle.$$

This means that $y^* \in D(A^*)$ and $A^*y^* = -x^*$. Hence, by the assumption on $x$ and $y$,

$$\langle (x,y), (x^*,y^*) \rangle = \langle x, -A^*y^* \rangle + \langle y, y^* \rangle = 0.$$


In what follows we let $H$ and $K$ be Hilbert spaces. When $A$ is a densely defined
operator acting from $H$ to $K$, the Riesz representation theorem may be used to identify
the adjoint $A^*$, which acts from $K^*$ to $H^*$, with a linear operator acting from $K$ to
$H$, which is again denoted by $A^*$. Thus, by definition, an element $k \in K$ belongs to $D(A^*)$ if there exists a (necessarily
unique) element $h \in H$ such that

$$(x|h) = (Ax|k), \qquad x \in D(A),$$

and in that case $A^*k = h$. Thus we have the identity

$$(x|A^*k) = (Ax|k), \qquad x \in D(A), \ k \in D(A^*).$$

Definition 10.21 (Hilbert space adjoint). The operator $A^*$ is called the Hilbert space
adjoint of $A$.


Proposition 10.18 admits the following Hilbertian version. We denote by $H \oplus K$ the
Hilbert space obtained by endowing the cartesian product $H \times K$ with the inner product

$$((h,k)|(h',k')) := (h|h') + (k|k').$$

Proposition 10.22. If $A$ is a densely defined linear operator from $H$ to $K$, then $A^*$ is
closed and we have

$$G(A^*) = (J(G(A)))^\perp$$

in the sense of orthogonal complements, where $J : H \oplus K \to K \oplus H$ is defined by
$J(x,y) := (-y,x)$. If $A$ is densely defined and closed, then $A^*$ is densely defined.


Proof In Hilbert spaces the weak topology and the weak* topology agree, and a subspace
is weakly dense (respectively, weakly closed) if and only if it is dense (respectively,
closed). Hence everything follows from Proposition 10.18, except the statement
that $G(A^*)$ is the orthogonal complement of $J(G(A))$ in $K \oplus H$. This follows from

$$(k,h) \in G(A^*) \iff (Ax|k) = (x|h) \text{ for all } x \in D(A) \iff (k,h) \perp (-Ax, x) \text{ for all } x \in D(A) \iff (k,h) \perp J(G(A)).$$


Mutatis mutandis, Proposition 10.19 admits a Hilbertian version as well; we leave 
this as an exercise to the reader. 


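Proposition 10.23. If $A$ is a densely defined closed operator acting from $H$ to $K$, then
$A^{**} = A$ with equality of domains.


Proof If $Z$ is a subspace of $H \oplus K$, then

$$(k,h) \perp J(Z) \iff (h,-k) \in Z^\perp \iff (k,h) \in J(Z^\perp),$$

which shows that $(J(Z))^\perp = J(Z^\perp)$. Using Proposition 10.22 it follows that $G(A^*) =
(J(G(A)))^\perp = J((G(A))^\perp)$ and, writing $J$ also for the mapping $(k,h) \mapsto (-h,k)$ from $K \oplus H$ to $H \oplus K$, so that $JJ = -I$,

$$G(A^{**}) = (J(G(A^*)))^\perp = (JJ((G(A))^\perp))^\perp = (G(A))^{\perp\perp} = G(A),$$

where the last step uses that $G(A)$ is closed.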


For operators acting in Hilbert spaces we have the following extension of Proposition
4.31, the proof of which is almost verbatim the same:

Proposition 10.24. If $A$ is a densely defined closed operator from $H$ into another
Hilbert space $K$, then $H$ and $K$ admit orthogonal decompositions

$$H = N(A) \oplus \overline{R(A^*)}, \qquad K = N(A^*) \oplus \overline{R(A)}.$$
In particular, 


(1) A is injective if and only if A* has dense range; 
(2) A has dense range if and only if A* is injective. 


10.1.c The Spectrum 


The spectrum of a linear operator is defined in the same way as for bounded operators, 
except that explicit attention has to be paid to domains. 


Definition 10.25 (Resolvent and spectrum). The resolvent set of a linear operator $A$
acting in $X$ is the set $\rho(A)$ consisting of all $\lambda \in \mathbb{C}$ for which the operator $\lambda I - A$ has a
two-sided inverse, that is, there exists a bounded operator $U$ on $X$ such that:


(i) for all $x \in D(A)$ we have $U(\lambda I - A)x = x$;
(ii) for all $x \in X$ we have $Ux \in D(A)$ and $(\lambda I - A)Ux = x$.

The spectrum of $A$ is defined as the complement of the resolvent set of $A$:

$$\sigma(A) := \mathbb{C} \setminus \rho(A).$$


We emphasise that, although $A$ is allowed to be unbounded, the two-sided inverse
$U = (\lambda I - A)^{-1}$ is required to be bounded. It is customary to write

$$R(\lambda, A) := (\lambda - A)^{-1}$$

for $\lambda \in \rho(A)$. As in the bounded case the resolvent identity holds: if $\lambda, \mu \in \rho(A)$, then

$$R(\lambda,A) - R(\mu,A) = (\mu - \lambda) R(\lambda,A) R(\mu,A). \tag{10.2}$$
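
For orientation, the resolvent identity can be checked in one line, paying attention to domains. The following is a sketch of the standard argument rather than a quotation from the text; it uses only properties (i) and (ii) of Definition 10.25.

```latex
% Sketch of the resolvent identity for \lambda, \mu \in \rho(A).
% For x \in X we have R(\mu,A)x \in D(A), so (\lambda - A) and (\mu - A)
% may be applied to it.
\begin{align*}
  R(\lambda,A) - R(\mu,A)
    &= R(\lambda,A)\,(\mu - A)\,R(\mu,A) \;-\; R(\lambda,A)\,(\lambda - A)\,R(\mu,A) \\
    &= R(\lambda,A)\,\bigl[(\mu - A) - (\lambda - A)\bigr]\,R(\mu,A) \\
    &= (\mu - \lambda)\,R(\lambda,A)\,R(\mu,A).
\end{align*}
% First line: (\mu - A)R(\mu,A) = I on X by (ii), and
% R(\lambda,A)(\lambda - A) = I on D(A) by (i).
```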


By the observation in Example 10.8, a linear operator with nonempty resolvent set is
closed. The proofs of Lemma 6.7, the holomorphy of the resolvent (contained as part
of Lemma 6.10), and Propositions 6.12 and 6.17 carry over verbatim, and Proposition
1.21 carries over with an obvious adaptation of the proof. For the reader’s convenience
we state the results here:


Proposition 10.26. If $A$ is closed and satisfies $\|Ax\| \ge C\|x\|$ for some $C > 0$ and all
$x \in D(A)$, then $A$ is injective and has closed range.

Proposition 10.27. The spectrum $\sigma(A)$ is a closed subset of $\mathbb{C}$. More precisely, if $\lambda \in
\rho(A)$, then $B(\lambda; r) \subseteq \rho(A)$ with $r = 1/\|R(\lambda,A)\|$. Moreover, if $|\lambda - \mu| \le \delta r$ with $0 \le
\delta < 1$, then

$$\|R(\mu,A)\| \le \frac{1}{1-\delta} \|R(\lambda,A)\|.$$
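
The estimate in Proposition 10.27 is the usual Neumann series bound. The following sketch records the computation; it is the standard argument from the bounded case, stated here under the assumptions of the proposition.

```latex
% Sketch: Neumann series for the resolvent near \lambda \in \rho(A).
% Assumption: |\lambda - \mu| \le \delta r with r = 1/\|R(\lambda,A)\|, 0 \le \delta < 1.
\[
  \mu - A = \bigl(I - (\lambda - \mu) R(\lambda,A)\bigr)(\lambda - A)
  \quad \text{on } D(A),
  \qquad
  \|(\lambda - \mu) R(\lambda,A)\| \le \delta < 1 .
\]
\[
  R(\mu,A) = \sum_{n=0}^{\infty} (\lambda - \mu)^n R(\lambda,A)^{n+1},
  \qquad
  \|R(\mu,A)\| \le \|R(\lambda,A)\| \sum_{n=0}^{\infty} \delta^n
  = \frac{1}{1-\delta}\,\|R(\lambda,A)\| .
\]
```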


Proposition 10.28. The function $\lambda \mapsto R(\lambda,A)$ is holomorphic on $\rho(A)$, and its complex
derivative is given by $-R(\lambda,A)^2$.


Proposition 10.29. If $\lambda_n \to \lambda$ in $\mathbb{C}$, with each $\lambda_n \in \rho(A)$ and with $\lambda \in \partial\rho(A)$, then

$$\lim_{n\to\infty} \|R(\lambda_n, A)\| = \infty.$$

By the result of Examples 10.7 and 10.8 a linear operator which is not closed has
full spectrum $\sigma(A) = \mathbb{C}$. For if we had $\lambda \in \rho(A)$, then the boundedness of $R(\lambda,A) =
(\lambda - A)^{-1}$ would exhibit $\lambda - A$ as the inverse of a bounded (hence closed) operator, and hence
$A$ would be closed.

The following proposition gives a simple but powerful uniqueness result:


Proposition 10.30. Let $A$ and $B$ be linear operators on a Banach space $X$. If $\rho(A) \cap
\rho(B) \neq \emptyset$ and $B$ is an extension of $A$, then $A = B$ with equality of domains.


Proof Fix an arbitrary $\lambda \in \rho(A) \cap \rho(B)$. Then for all $x \in X$ we have $R(\lambda,A)x \in D(A) \subseteq
D(B)$ and

$$(\lambda - B) R(\lambda,A)x = (\lambda - A) R(\lambda,A)x = x.$$

Multiplying both sides from the left with $R(\lambda,B)$ gives $R(\lambda,A)x = R(\lambda,B)x$. Since $x \in X$
was arbitrary, we conclude that $R(\lambda,A) = R(\lambda,B)$ and therefore $D(A) = D(B)$.


The following result is proved in the same way as Propositions 6.18 and 8.9.

Proposition 10.31. If $A$ is a densely defined operator in a Banach space $X$, then

$$\sigma(A^*) = \sigma(A).$$

If $A$ is a densely defined operator in a Hilbert space $H$, then its Hilbert space adjoint satisfies

$$\sigma(A^*) = \overline{\sigma(A)},$$

where the bar denotes complex conjugation.

For later use we compute the spectrum of a simple diagonal operator.


Proposition 10.32. Let $A$ be a densely defined closed operator in a separable Hilbert
space $H$ with an orthonormal basis $(h_n)_{n\ge 1}$ of eigenvectors. If $\rho(A) \neq \emptyset$ and the corresponding
eigenvalue sequence $(\lambda_n)_{n\ge 1}$ satisfies $\lim_{n\to\infty} |\lambda_n| = \infty$, then

$$\sigma(A) = \{\lambda_n : n \ge 1\}.$$

Proof Let $\mu \notin \{\lambda_n : n \ge 1\}$. The assumption $|\lambda_n| \to \infty$ implies that $\inf_{n\ge 1} |\mu - \lambda_n| =:
\delta > 0$, and therefore the mapping

$$R_\mu : h_n \mapsto \frac{1}{\mu - \lambda_n} h_n$$

has a unique extension to a bounded operator on $H$ of norm at most $1/\delta$. It is clear that
this operator is injective, so its inverse $R_\mu^{-1}$ is closed. Hence the operator $B := \mu - R_\mu^{-1}$
with domain $D(B) := R(R_\mu)$ is closed. Clearly $\mu \in \rho(B)$ and $R(\mu,B) = R_\mu$. Moreover,

$$Bh_n = \mu h_n - (\mu - \lambda_n) h_n = \lambda_n h_n = Ah_n, \qquad n \ge 1.$$

We claim that the linear span $Y$ of the vectors $h_n$, $n \ge 1$, is dense in $D(B)$ with respect
to the graph norm. Indeed, let $g \in D(B)$. Then $g \in R(R_\mu)$, say $g = R_\mu h$ with
$h = \sum_{j\ge 1} c_j h_j \in H$. Let $P_k$ denote the orthogonal projection onto the span of $h_1, \dots, h_k$.
Then $P_k g \to g$ in $H$ as $k \to \infty$. Also, $P_k g \in R(R_\mu) = D(B)$ and

$$(\mu - B) P_k g = (\mu - B) P_k R_\mu h = \sum_{j=1}^{k} \frac{c_j}{\mu - \lambda_j} (\mu - B) h_j = \sum_{j=1}^{k} c_j h_j \to h = R_\mu^{-1} g = (\mu - B) g$$

as $k \to \infty$. This implies $BP_k g \to Bg$. It follows that $P_k g \to g$ in $D(B)$ as claimed. Since
$Y$ is contained in $D(A)$, $A$ and $B$ agree on $Y$, and $A$ is closed, it follows that $B \subseteq A$.

Now let $\mu_0 \in \rho(A)$. Then $\mu_0 \notin \{\lambda_n : n \ge 1\}$ since every $\lambda_n$ is an eigenvalue for $A$, and
therefore $\mu_0 \in \rho(B)$ by what we just proved. Proposition 10.30 now implies $A = B$. But
then, by what we already proved for $B$, every $\mu \notin \{\lambda_n : n \ge 1\}$ belongs to $\rho(A)$.
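
The key estimate in this proof, namely that the diagonal operator $R_\mu$ has norm at most $1/\delta$ with $\delta = \inf_n |\mu - \lambda_n|$, can be illustrated numerically on a finite truncation. The sketch below is only an illustration: the eigenvalues $\lambda_n = n^2$, the truncation size, and all function names are assumptions made for this example and do not come from the text.

```python
# Finite truncation of a diagonal operator A h_n = lambda_n h_n.
# The eigenvalues lambda_n = n^2 are an illustrative assumption.

def eigenvalues(count):
    """Return the first `count` eigenvalues lambda_n = n^2."""
    return [float(n * n) for n in range(1, count + 1)]

def resolvent_diagonal(mu, lambdas):
    """Diagonal entries 1/(mu - lambda_n) of R_mu in the basis (h_n)."""
    return [1.0 / (mu - lam) for lam in lambdas]

def diagonal_norm(entries):
    """Operator norm of a diagonal operator: the largest modulus of an entry."""
    return max(abs(e) for e in entries)

lambdas = eigenvalues(200)
mu = 2.5  # a point that is not an eigenvalue
delta = min(abs(mu - lam) for lam in lambdas)
r_mu = resolvent_diagonal(mu, lambdas)
norm_r_mu = diagonal_norm(r_mu)

# (mu - A) R_mu = I holds coordinate by coordinate.
identity_defect = max(abs((mu - lam) * r - 1.0) for lam, r in zip(lambdas, r_mu))

print(delta, norm_r_mu, identity_defect)
```

For a diagonal operator the bound $1/\delta$ is attained, which is why the computed norm coincides with $1/\delta$ up to rounding.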


10.2 Unbounded Selfadjoint Operators 


In what follows we let H be a Hilbert space. 


Definition 10.33 (Symmetric and positive operators). A linear operator $A$ acting in $H$
is called:


• symmetric, if for all $x,y \in D(A)$ we have $(Ax|y) = (x|Ay)$;
• positive, if for all $x \in D(A)$ we have $(Ax|x) \ge 0$.


Over the complex scalars, positive operators are symmetric. Indeed, if $A$ is positive,
then for all $x \in D(A)$ we have $(Ax|x) = \overline{(Ax|x)} = (x|Ax)$. By polarisation (as in the
proof of Proposition 8.1, this requires working over the complex scalars) this implies
$(Ax|y) = (x|Ay)$ for all $x,y \in D(A)$.

It is an immediate consequence of Definition 10.33 and the definition of $A^*$ that if $A$
is densely defined and symmetric, then $D(A) \subseteq D(A^*)$ and $Ax = A^*x$ for all $x \in D(A)$,
that is, $A^*$ is an extension of $A$. Since $A^*$ is closed, we have shown:

Proposition 10.34. If $A$ is a densely defined symmetric operator, then $A$ is closable and
$A^*$ is a closed extension of $A$.


In general, $D(A)$ may be strictly smaller than $D(A^*)$. A simple example is the Laplace
operator $\Delta$ on $L^2(\mathbb{R}^d)$ with domain $C_c^\infty(\mathbb{R}^d)$: this operator is densely defined and symmetric
but not closed, and therefore $\Delta^*$ is a proper extension of $\Delta$. This motivates the
following definition.


Definition 10.35 (Selfadjoint operators). A densely defined operator $A$ in $H$ is called
selfadjoint if $A = A^*$, that is, if $D(A) = D(A^*)$ and $Ah = A^*h$ for all $h \in D(A) = D(A^*)$.
The operator $A$ is called essentially selfadjoint if it is closable and its closure $\overline{A}$ is
selfadjoint.


Here are some simple examples. 


Example 10.36 (Multipliers). Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and let $m : \Omega \to \mathbb{C}$ be
a measurable function. It has been shown in Example 10.6 that the linear operator $M_m$
in $L^2(\Omega)$ defined by

$$D(M_m) := \{f \in L^2(\Omega) : mf \in L^2(\Omega)\}, \qquad M_m f := mf, \quad f \in D(M_m),$$

is densely defined and closed. It is immediate from the definition of the Hilbert space
adjoint that $M_m^* = M_{\overline{m}}$ with equality of domains $D(M_{\overline{m}}) = D(M_m)$. As a consequence,
$M_m$ is selfadjoint in $L^2(\Omega)$ if and only if $m$ is real-valued $\mu$-almost everywhere.
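
The identity $M_m^* = M_{\overline{m}}$ is easy to test in the simplest possible model, where $\Omega$ is a finite set with counting measure, so that $L^2(\Omega) = \mathbb{C}^N$ and all domain questions disappear. The following sketch makes exactly this simplifying assumption; the sample vectors and helper names are invented for the illustration.

```python
# Finite model: Omega = {0, 1, 2, 3} with counting measure, L^2(Omega) = C^4.
# All data below are illustrative assumptions.

def inner(f, g):
    """Inner product (f|g) = sum_k f_k * conj(g_k), linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(f, g))

def multiply(m, f):
    """The multiplier M_m f = m f, computed pointwise."""
    return [mk * fk for mk, fk in zip(m, f)]

m = [1 + 2j, -3j, 0.5 + 0j, 4 - 1j]
f = [1 + 1j, 2 + 0j, -1j, 0.25 + 0j]
g = [0.5j, 1 - 1j, 3 + 0j, -2 + 2j]

m_bar = [mk.conjugate() for mk in m]

# (M_m f | g) should equal (f | M_{conj(m)} g).
lhs = inner(multiply(m, f), g)
rhs = inner(f, multiply(m_bar, g))
adjoint_defect = abs(lhs - rhs)

# For a real-valued multiplier the operator is symmetric.
m_real = [2.0 + 0j, -1.0 + 0j, 0.0 + 0j, 3.5 + 0j]
symmetry_defect = abs(inner(multiply(m_real, f), g) - inner(f, multiply(m_real, g)))

print(adjoint_defect, symmetry_defect)
```

In the infinite-dimensional setting the same computation goes through for $f \in D(M_m)$ and $g \in D(M_{\overline{m}})$; the additional work lies in showing that $D(M_m^*)$ is no larger than $D(M_{\overline{m}})$.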


Example 10.37 (Fourier multipliers). Let $m : \mathbb{R}^d \to \mathbb{R}$ be a real-valued measurable function
and let $A_m$ denote the (possibly unbounded) Fourier multiplier in $L^2(\mathbb{R}^d)$ defined
by

$$D(A_m) := \{f \in L^2(\mathbb{R}^d) : m\widehat{f} \in L^2(\mathbb{R}^d)\}, \qquad A_m f := (m\widehat{f})^{\vee}, \quad f \in D(A_m). \tag{10.3}$$

Let us prove that $A_m$ is selfadjoint in $L^2(\mathbb{R}^d)$. The symmetry of the multiplier $M_m$ considered
in the previous example implies that $A_m$ is symmetric: for all $f,g \in D(A_m)$ we
have $\widehat{f}, \widehat{g} \in D(M_m)$ and, by the Plancherel theorem,

$$(A_m f|g) = (M_m \widehat{f}|\widehat{g}) = (\widehat{f}|M_m \widehat{g}) = (f|A_m g).$$

By Proposition 10.34 this implies $D(A_m) \subseteq D(A_m^*)$. Conversely, if $g \in D(A_m^*)$ and $A_m^* g =
h$, then for $f \in D(A_m)$ we have, since $\widehat{f} \in D(M_m)$,

$$(\widehat{f}|\widehat{h}) = (f|h) = (A_m f|g) = (\widehat{A_m f}|\widehat{g}) = (M_m \widehat{f}|\widehat{g}).$$

This means that $\widehat{g} \in D(M_m^*)$. Hence, by the previous example, $\widehat{g} \in D(M_m)$. This means
that $m\widehat{g} \in L^2(\mathbb{R}^d)$ and therefore $g \in D(A_m)$.

Two special cases are of particular interest:


Example 10.38 (The momentum operator). The multiplier $m(\xi) = \xi$ gives rise to the
operator $\frac{1}{i}\frac{d}{dx}$ on $L^2(\mathbb{R})$. With the domain given by (10.3) this operator is selfadjoint.
With the notation and techniques developed in Section 11.1.e this domain is seen to be
the Sobolev space $H^1(\mathbb{R}) = W^{1,2}(\mathbb{R})$.

Example 10.39 (The Laplacian). The multiplier $m(\xi) = -|\xi|^2$ gives rise to the Laplace
operator $\Delta = \sum_{j=1}^{d} \partial_j^2$ on $L^2(\mathbb{R}^d)$. With the domain given by (10.3) this operator is
selfadjoint. With the notation and techniques developed in Section 11.1.e this domain is
seen to be the Sobolev space $H^2(\mathbb{R}^d) = W^{2,2}(\mathbb{R}^d)$.


The following version of Theorem 8.11 holds. 


Proposition 10.40. If $A$ is selfadjoint, then $\sigma(A) \subseteq \mathbb{R}$. If, in addition, $A$ is positive, then
$\sigma(A) \subseteq [0,\infty)$.


Proof This may be established by repeating parts of the proof of Theorem 8.11, using
Propositions 10.26 and 10.24 instead of Propositions 1.21 and 4.31, respectively.


The following proposition provides a sufficient condition for selfadjointness. 


Proposition 10.41. If $A$ is densely defined and symmetric and $\rho(A) \cap \mathbb{R} \neq \emptyset$, then $A$ is
selfadjoint.


Proof The nonemptiness of the resolvent set implies that $A$ is closed (cf. Example
10.8). The operator $A^*$ is closed as well. The symmetry of $A$ implies $D(A) \subseteq D(A^*)$, and
if $\lambda \in \rho(A) \cap \mathbb{R}$, then $\lambda \in \rho(A^*)$ in view of Proposition 10.31. The identity $A = A^*$ with
equality of domains therefore follows from Proposition 10.30.


An efficient proof of the next proposition is obtained by noting that Proposition 10.22
implies the following criterion for selfadjointness: a densely defined operator $A$ in $H$ is
selfadjoint if and only if

$$G(A) = (J(G(A)))^\perp,$$

where $J(x,y) = (-y,x)$ for $x,y \in H$.


Proposition 10.42. If the linear operator $A$ is selfadjoint, injective, and has dense
range, then its inverse $A^{-1}$ with domain $D(A^{-1}) = R(A)$ is selfadjoint.


Proof From

$$(x,y) \in G(A^{-1}) \iff (y,x) \in G(A) \iff J(x,-y) \in G(A) \iff (x,-y) \in J(G(A)) \iff (x,y) \in J(G(-A))$$

we see that $G(A^{-1}) = J(G(-A))$. Applying $J$ to both sides gives $J(G(A^{-1})) = G(-A)$.
Hence, since $-A$ is selfadjoint, by the above criterion

$$(J(G(A^{-1})))^\perp = (G(-A))^\perp = J(G(-A)) = G(A^{-1}).$$

Applying the criterion once more, this proves that $A^{-1}$ is selfadjoint.


As a simple application of Proposition 10.42 we record the following result. 


Corollary 10.43. Let $A$ be a densely defined closed positive operator in $H$. If $I + A$ has
dense range, then $A$ is selfadjoint.


Proof From $\|(I+A)x\|\|x\| \ge ((I+A)x|x) \ge \|x\|^2$ we see that $\|(I+A)x\| \ge \|x\|$ for
all $x \in D(A)$. Since $A$ (and hence $I + A$) is closed, by Proposition 10.26 this implies
that $I + A$ is injective and has closed range. Since $I + A$ also has dense range, $I + A$
is surjective and the inverse $(I+A)^{-1}$ is well defined as a linear operator. The bound
$\|(I+A)x\| \ge \|x\|$ implies that $(I+A)^{-1}$ is bounded (and in fact contractive). At the
same time, this bounded operator is positive and therefore selfadjoint. Proposition 10.42
therefore implies that $I + A$, hence also $A$, is selfadjoint.


As an application of Corollary 10.43 we have the following sufficient condition for 
selfadjointness. 


Theorem 10.44 (Selfadjointness of $A^*A$). If $A$ is a densely defined closed operator from
$H$ into another Hilbert space $K$, then:


(1) the operator $A^*A$ is selfadjoint and positive;
(2) $D(A^*A)$ is dense in $D(A)$ with respect to the graph norm.


This operator will be revisited in Proposition 12.18 in connection with the theory of 
forms. 


Proof (1): We check that the operator $A^*A$, which is obviously positive, satisfies the
assumptions of Corollary 10.43.
By Proposition 10.22 we have the orthogonal decomposition

$$K \oplus H = G(A^*) \oplus J(G(A)).$$

Hence for any $u \in H$ we can find $x \in D(A)$ and $y \in D(A^*)$ such that

$$(0,u) = (y, A^*y) + J(x, Ax) = (y - Ax, A^*y + x).$$

It follows that $y = Ax$, which implies $x \in D(A^*A)$, and $u = A^*y + x = (I + A^*A)x$. This
proves that $I + A^*A$ is surjective.


(2): To prove density of $D(A^*A)$ in $D(A)$ with respect to the graph norm, suppose
that $x \in D(A)$ is such that $(x|y)_{D(A)} = 0$ for all $y \in D(A^*A)$, where $(x|y)_{D(A)} := (x|y) +
(Ax|Ay)$ is the inner product of $D(A)$, viewed as a Hilbert space with respect to this inner
product (completeness being a consequence of the closedness of $A$; see Proposition
10.3). Then

$$0 = (x|y) + (x|A^*Ay) = (x|(I + A^*A)y)$$

for all $y \in D(A^*A)$. Since $I + A^*A$ is surjective, this means that $(x|z) = 0$ for all $z \in H$,
so $x = 0$.


We finish this section with another useful criterion for selfadjointness. 


Theorem 10.45. For a densely defined symmetric operator A in H the following asser- 
tions are equivalent: 


(1) A is selfadjoint; 
(2) A is closed and N(A* +i) = N(A* —i) = {0}; 
(3) R(A+i) =R(A—i) =H. 


Proof (1)$\Rightarrow$(2): If $A$ is selfadjoint, then $A = A^*$ is closed by Proposition 10.22. If
$x \in D(A^*)$ satisfies $(A^* + i)x = 0$, then $x \in D(A)$ and $Ax = A^*x = -ix$, and

$$-i(x|x) = (Ax|x) = (x|A^*x) = i(x|x)$$

implies $x = 0$. In the same way $(A^* - i)x = 0$ implies $x = 0$.

(2)$\Rightarrow$(3): By the same argument as in the proof of Proposition 4.31, the injectivity
of $(A \pm i)^* = A^* \mp i$ implies that $A \pm i$ has dense range (and conversely; this will be
used in the proof of the next implication). By the same argument as in the proof of
Theorem 8.11, the symmetry of $A$ implies $\|(A \pm i)x\| \ge \|x\|$ for all $x \in D(A)$, and since
$A$ is closed, Proposition 10.26 implies that the ranges of $A \pm i$ are closed. We conclude
that both ranges equal $H$.

(3)$\Rightarrow$(1): Fix an arbitrary $h \in D(A^*)$. The assumption $R(A - i) = H$ implies
that there exists an $h' \in D(A)$ such that $(A - i)h' = (A^* - i)h$. Since $A^*$ extends $A$, we
have $h' \in D(A^*)$ and $A^*h' = Ah'$. It follows that $(A^* - i)h' = (A^* - i)h$. As was noted in
the proof of the previous implication, the assumptions imply that $A^* - i$ is injective and
therefore $h = h'$. Since $h' \in D(A)$, this implies that $h \in D(A)$ and $Ah = A^*h$, the latter
since $A^*$ extends $A$.

This shows that $A$ extends $A^*$. Since $A^*$ extends $A$, these operators are equal.
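
As a quick illustration of criterion (3), consider the multiplier $M_m$ of Example 10.36 with $m$ real-valued. The following sketch (an illustration added here, using only the definitions from that example) verifies $R(M_m \pm i) = L^2(\Omega)$ directly, which gives a second proof that $M_m$ is selfadjoint.

```latex
% Sketch: R(M_m \pm i) = L^2(\Omega) for a real-valued measurable m.
% Given g \in L^2(\Omega), define
\[
  f := \frac{g}{m \pm i} .
\]
% Since m is real-valued, |m \pm i|^2 = m^2 + 1 \ge 1, hence
\[
  |f| \le |g|, \qquad |mf| = \frac{|m|}{(m^2+1)^{1/2}}\,|g| \le |g| ,
\]
% so f \in L^2(\Omega) and mf \in L^2(\Omega), that is, f \in D(M_m), and
\[
  (M_m \pm i) f = (m \pm i)\,\frac{g}{m \pm i} = g .
\]
% Since M_m is densely defined and symmetric, Theorem 10.45 applies.
```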


The theory of selfadjoint operators is taken up again in Section 12.2 in connection 
with the theory of forms. 


10.3 Unbounded Normal Operators


Having dealt with unbounded selfadjoint operators, we now turn to unbounded normal 
operators. 


10.3.a Definition and General Properties


Definition 10.46 (Normal operators). A linear operator $A$ in $H$ is said to be normal if it
is closed, densely defined, and satisfies

$$A^*A = AA^*.$$

The equality $A^*A = AA^*$ is shorthand for equality of the domains

$$D(A^*A) := \{x \in D(A) : Ax \in D(A^*)\}, \qquad D(AA^*) := \{x \in D(A^*) : A^*x \in D(A)\},$$

along with equality

$$A^*Ax = AA^*x$$

for all $x$ in this common domain.


Since A** = A by Proposition 10.23, a densely defined closed operator A is normal if 
and only if its adjoint A* is normal. 


Proposition 10.47. If $A$ is a normal operator, then:


(1) $D(A) = D(A^*)$;
(2) $\|Ax\| = \|A^*x\|$ for all $x \in D(A) = D(A^*)$;
(3) if $A \subseteq B$ with $B$ normal, then $A = B$.


Proof (1) and (2): The normality of $A$ implies that if $x \in D(A^*A) = D(AA^*)$, then
$x \in D(A)$, $x \in D(A^*)$, and

$$\|Ax\|^2 = (A^*Ax|x) = (AA^*x|x) = \|A^*x\|^2.$$

By Theorem 10.44, $D(A^*A)$ is dense in $D(A)$, so for any $x \in D(A)$ we may choose a
sequence $x_n \to x$ with each $x_n \in D(A^*A) = D(AA^*)$ and with convergence in the graph
norm of $D(A)$. Then $x_n, x_m \in D(A^*)$, and from

$$\lim_{n,m\to\infty} \|A^*x_n - A^*x_m\| = \lim_{n,m\to\infty} \|Ax_n - Ax_m\| = 0$$

we infer that $A^*x_n \to y$ for some $y \in H$. From the closedness of $A^*$ (Proposition 10.22)
we infer that $x \in D(A^*)$ and $A^*x = y$. This argument shows that $D(A) \subseteq D(A^*)$, and passing to the limit in $\|Ax_n\| = \|A^*x_n\|$ gives $\|Ax\| = \|A^*x\|$.

Since $A^*$ is normal, what we just proved can be applied to $A^*$. This, together with
Proposition 10.23, gives the reverse inclusion $D(A^*) \subseteq D(A^{**}) = D(A)$.

(3): If $A \subseteq B$ with $A$ and $B$ normal, then by (1), Proposition 10.19(1), and another
application of (1),

$$D(B) = D(B^*) \subseteq D(A^*) = D(A).$$

Together with the assumption $D(A) \subseteq D(B)$ this implies $D(A) = D(B)$.


10.3.b The Measurable Functional Calculus


Projection-valued measures give rise to normal operators: 


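Theorem 10.48 (Measurable functional calculus). Let $(\Omega, \mathscr{F})$ be a measurable space,
let $P : \mathscr{F} \to \mathscr{L}(H)$ be a projection-valued measure, and let $f : \Omega \to \mathbb{C}$ be a measurable
function. There exists a unique normal operator $\Phi(f)$ in $H$ satisfying

$$D(\Phi(f)) = \Bigl\{x \in H : \int_\Omega |f|^2 \, dP_x < \infty\Bigr\},$$

$$(\Phi(f)x|x) = \int_\Omega f \, dP_x, \qquad x \in D(\Phi(f)).$$

For all $x \in D(\Phi(f))$ we have

$$\|\Phi(f)x\|^2 = \int_\Omega |f|^2 \, dP_x. \tag{10.4}$$

Furthermore, if $f_n, f, g : \Omega \to \mathbb{C}$ are measurable functions, then:


(1) $\Phi(f)\Phi(g) \subseteq \Phi(fg)$ with $D(\Phi(f)\Phi(g)) = D(\Phi(fg)) \cap D(\Phi(g))$;

(2) $\Phi(f)^* = \Phi(\overline{f})$;

(3) if $0 \le |f_n| \le |f|$ and $\lim_{n\to\infty} f_n = f$ pointwise on $\Omega$, then $D(\Phi(f)) \subseteq D(\Phi(f_n))$ and

$$\lim_{n\to\infty} \Phi(f_n)x = \Phi(f)x, \qquad x \in D(\Phi(f)).$$


The operator $\Phi(f)$ is selfadjoint if and only if $f$ is real-valued $P_x$-almost everywhere for
all $x \in H$.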


It follows from (1) that

$$\Phi(f)\Phi(g) = \Phi(fg) \iff D(\Phi(fg)) \subseteq D(\Phi(g)). \tag{10.5}$$

This is trivially the case if $g$ is bounded, for then $D(\Phi(g)) = H$. In that case $\Phi(g)$ is
bounded and equals the operator given by the bounded calculus of Theorem 9.8. This
fact, and the properties of the bounded calculus, will be frequently used in the proof
below.


Example 10.49. Under the above assumptions, it follows from (10.5) that

$$\Phi(f^n) = (\Phi(f))^n, \qquad n = 1, 2, \dots$$

To prove this, proceeding by induction it suffices to check that $\Phi(f^{k+1}) = \Phi(f^k)\Phi(f)$
for all $k = 1, 2, \dots$ By (10.5), this operator identity holds if and only if $D(\Phi(f^{k+1})) \subseteq
D(\Phi(f))$. If $x \in D(\Phi(f^{k+1}))$, then $\int_\Omega |f|^{2k+2} \, dP_x < \infty$. Since $P_x$ is a finite measure, this
implies $\int_\Omega |f|^2 \, dP_x < \infty$, that is, $x \in D(\Phi(f))$.

Proof For the moment define $D_f$ to be the set $\{x \in H : \int_\Omega |f|^2 \, dP_x < \infty\}$.

If $x,y \in D_f$ and $g = \sum_{j=1}^{k} c_j 1_{B_j}$ is a simple function satisfying $0 \le g \le |f|^2$, with $c_j \ge 0$
and disjoint sets $B_j \in \mathscr{F}$, then by the Cauchy-Schwarz inequality for the sesquilinear
forms $(x,y) \mapsto (P_{B_j}x|y)$,

$$\begin{aligned}
\int_\Omega g \, dP_{x+y} &= \sum_{j=1}^{k} c_j (P_{B_j}(x+y)|x+y) \\
&\le \sum_{j=1}^{k} c_j \bigl((P_{B_j}x|x) + 2(P_{B_j}x|x)^{1/2}(P_{B_j}y|y)^{1/2} + (P_{B_j}y|y)\bigr) \\
&\le 2\sum_{j=1}^{k} c_j \bigl((P_{B_j}x|x) + (P_{B_j}y|y)\bigr) \\
&= 2\int_\Omega g \, dP_x + 2\int_\Omega g \, dP_y \le 2\int_\Omega |f|^2 \, dP_x + 2\int_\Omega |f|^2 \, dP_y.
\end{aligned}$$

Taking the supremum over all such simple functions $g$, we obtain

$$\int_\Omega |f|^2 \, dP_{x+y} \le 2\int_\Omega |f|^2 \, dP_x + 2\int_\Omega |f|^2 \, dP_y.$$

This shows that $D_f$ is closed under addition. The identity

$$\int_\Omega |f|^2 \, dP_{cx} = |c|^2 \int_\Omega |f|^2 \, dP_x$$

is evident for simple functions $f$, follows for general functions $f$ by approximation, and
shows that $D_f$ is also closed under scalar multiplication. It follows that $D_f$ is a linear
subspace of $H$.

To prove that $D_f$ is dense in $H$ fix an arbitrary $x \in H$ and for $n = 1, 2, \dots$ let $B_n :=
\{|f| \le n\}$. Then for all $y \in R(P_{B_n})$ and $B \in \mathscr{F}$,

$$\int_\Omega 1_B \, dP_y = (P_B y|y) = (P_B P_{B_n} y|y) = (P_{B \cap B_n} y|y) = \int_\Omega 1_{B \cap B_n} \, dP_y,$$

so by linearity and monotone convergence,

$$\int_\Omega |f|^2 \, dP_y = \int_\Omega 1_{B_n} |f|^2 \, dP_y \le n^2 P_y(B_n) \le n^2 P_y(\Omega) = n^2 \|y\|^2.$$

This implies that $R(P_{B_n})$ is contained in $D_f$. Since $\Omega = \bigcup_{n\ge 1} B_n$, monotone convergence
implies that $\|P_{B_n}x\|^2 = (P_{B_n}x|x) = \int_\Omega 1_{B_n} \, dP_x \to \int_\Omega 1 \, dP_x = \|x\|^2$ and therefore

$$\|x - P_{B_n}x\|^2 = \|x\|^2 - 2(P_{B_n}x|x) + \|P_{B_n}x\|^2 \to 0$$

as $n \to \infty$. This proves that $x$ belongs to the closure of $D_f$.

For simple functions $g = \sum_{j=1}^{k} c_j 1_{B_j}$ we set

$$\Phi(g) := \sum_{j=1}^{k} c_j P_{B_j}.$$

It is routine to check that this is well defined and that (10.4) holds for $g$. If $x \in D_f$, then
$f \in L^2(\Omega, P_x)$. If $g_n \to f$ in $L^2(\Omega, P_x)$ with each $g_n$ simple, then

$$\|\Phi(g_n)x - \Phi(g_m)x\|^2 = \int_\Omega |g_n - g_m|^2 \, dP_x \to 0 \quad \text{as } n,m \to \infty.$$

Consequently for $x \in D_f$ we may define

$$\Phi(f)x := \lim_{n\to\infty} \Phi(g_n)x.$$

This is well defined, and the validity of (10.4) for $g_n$ implies the validity of (10.4) for $f$.

In this way we obtain a well defined linear operator $\Phi(f) : D_f \to H$. In what follows
we view $\Phi(f)$ as a linear operator in $H$ with domain $D(\Phi(f)) = D_f$. The closedness
of $\Phi(f)$ follows from (2) applied to $\overline{f}$ and Proposition 10.18. Normality of $\Phi(f)$ is an
easy consequence of (1) applied with $g = \overline{f}$, noting that $D(\Phi(|f|^2)) \subseteq D(\Phi(f))$ follows
from Hölder’s inequality or noting that $\int_\Omega |f|^2 \, dP_x \le \int_\Omega (1 + |f|^4) \, dP_x$.

By (2), $\Phi(f)$ is selfadjoint if and only if $\Phi(\overline{f}) = \Phi(f)$, and by (10.4) applied to
$f - \overline{f}$, this holds if and only if $f$ is real-valued $P_x$-almost everywhere for all $x \in H$.


(3): By (10.4) applied to $f - f_n$, for $x \in D(\Phi(f))$ we have $x \in D(\Phi(f - f_n))$ and

$$\lim_{n\to\infty} \|\Phi(f)x - \Phi(f_n)x\|^2 = \lim_{n\to\infty} \int_\Omega |f - f_n|^2 \, dP_x = 0$$

by dominated convergence, so $\lim_{n\to\infty} \Phi(f_n)x = \Phi(f)x$.


In what follows, for $n = 1, 2, \dots$ let

$$f_n := 1_{\{|f| \le n\}} f.$$

(1): First let $f$ be bounded and measurable and $g$ be measurable. For all $x \in D(\Phi(g))$
we have $x \in D(\Phi(fg))$, and using (3) (applied to the functions $g_n := 1_{\{|g| \le n\}} g$), the boundedness of $\Phi(f)$, and the multiplicativity
of the Borel calculus for bounded normal operators,

$$\Phi(f)\Phi(g)x = \lim_{n\to\infty} \Phi(f)\Phi(g_n)x = \lim_{n\to\infty} \Phi(fg_n)x = \Phi(fg)x.$$

Hence by (10.4),

$$\int_\Omega |f|^2 \, dP_{\Phi(g)x} = \int_\Omega |fg|^2 \, dP_x.$$

This being true for all bounded measurable functions $f$, by monotone convergence it is
true for all measurable functions $f$. Hence, if $f$ and $g$ are measurable, we infer that for
elements $x \in D(\Phi(g))$ we have $\Phi(g)x \in D(\Phi(f))$ if and only if $x \in D(\Phi(fg))$. This is
the same as saying that (1) holds.
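
(2): Let $x \in D(\Phi(f))$ and $y \in D(\Phi(\overline{f})) = D(\Phi(f))$. Then, by (3) and the properties
of the Borel calculus for bounded normal operators,

$$(\Phi(f)x|y) = \lim_{n\to\infty} (\Phi(f_n)x|y) = \lim_{n\to\infty} (x|\Phi(\overline{f_n})y) = (x|\Phi(\overline{f})y).$$

This shows that $y \in D(\Phi(f)^*)$ and $\Phi(f)^*y = \Phi(\overline{f})y$. We have thus proved the inclusion
$\Phi(\overline{f}) \subseteq \Phi(f)^*$. For the converse inclusion let $y \in D(\Phi(f)^*)$. We wish to prove that
$y \in D(\Phi(\overline{f})) = D(\Phi(f))$, that is, that $\int_\Omega |f|^2 \, dP_y < \infty$.
Let $z := \Phi(f)^*y$. We claim that

$$\Phi(1_{\{|f| \le n\}})z = \Phi(\overline{f_n})y. \tag{10.6}$$

It follows from (1), applied with $g = 1_{\{|f| \le n\}}$, that for all $x \in H$ we have
$\Phi(1_{\{|f| \le n\}})x \in D(\Phi(f))$ and $\Phi(f)\Phi(1_{\{|f| \le n\}})x = \Phi(f_n)x$.
Then, for all $x \in H$,

$$(x|\Phi(1_{\{|f| \le n\}})z) = (\Phi(1_{\{|f| \le n\}})x|\Phi(f)^*y) = (\Phi(f)\Phi(1_{\{|f| \le n\}})x|y) = (\Phi(f_n)x|y) = (x|\Phi(\overline{f_n})y),$$

using the conjugation property of the Borel calculus for bounded normal operators in
the last step. This proves the claim (10.6).
By (10.4) and (10.6),

$$\int_\Omega |f_n|^2 \, dP_y = \|\Phi(\overline{f_n})y\|^2 = \|\Phi(1_{\{|f| \le n\}})z\|^2 = \int_\Omega 1_{\{|f| \le n\}} \, dP_z$$

and therefore

$$\int_\Omega |f|^2 \, dP_y = \lim_{n\to\infty} \int_\Omega |f_n|^2 \, dP_y = \lim_{n\to\infty} \int_\Omega 1_{\{|f| \le n\}} \, dP_z \le \int_\Omega 1 \, dP_z = \|z\|^2,$$

so that $y \in D(\Phi(\overline{f}))$. This completes the proof of the identity $\Phi(\overline{f}) = \Phi(f)^*$.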
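
The simplest instance of Theorem 10.48 may help to fix ideas. Take $\Omega = \{0, \dots, N-1\}$, $H = \mathbb{C}^N$, and let $P_B$ be the coordinate projection onto the indices in $B$; then $P_x(\{k\}) = |x_k|^2$ and $\Phi(f)$ is the diagonal operator with entries $f(k)$. The sketch below checks (10.4), the product rule (1), and the adjoint rule (2) in this toy model. The model, the sample data, and the helper names are assumptions made for this illustration only.

```python
# Toy model of the measurable calculus: Omega = {0,...,N-1}, H = C^N,
# P_B = coordinate projection onto the indices in B (illustrative assumption).

def apply_phi(f, x):
    """Phi(f)x: in this model Phi(f) is the diagonal operator with entries f(k)."""
    return [fk * xk for fk, xk in zip(f, x)]

def inner(x, y):
    """Inner product (x|y), linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def integral(h, x):
    """Integral of h against P_x, where P_x({k}) = (P_{k} x | x) = |x_k|^2."""
    return sum(hk * abs(xk) ** 2 for hk, xk in zip(h, x))

f = [1 + 1j, -2j, 3 + 0j, 0.5 - 0.5j]
g = [2 + 0j, 1j, -1 + 1j, 4 + 0j]
x = [1 + 0j, 1 - 1j, 0.5j, -2 + 0j]
y = [0.5 + 0j, 2j, 1 + 1j, 1 - 3j]

# (10.4): ||Phi(f)x||^2 equals the integral of |f|^2 against P_x.
phi_f_x = apply_phi(f, x)
lhs_104 = inner(phi_f_x, phi_f_x).real
rhs_104 = integral([abs(fk) ** 2 for fk in f], x)

# (1): Phi(f)Phi(g) = Phi(fg); all functions are bounded here.
fg = [fk * gk for fk, gk in zip(f, g)]
product_defect = max(
    abs(a - b) for a, b in zip(apply_phi(f, apply_phi(g, x)), apply_phi(fg, x))
)

# (2): (Phi(f)x | y) = (x | Phi(conj f) y).
f_bar = [fk.conjugate() for fk in f]
adjoint_defect = abs(inner(apply_phi(f, x), y) - inner(x, apply_phi(f_bar, y)))

print(lhs_104, rhs_104, product_defect, adjoint_defect)
```

On a finite set every function is bounded, so none of the domain questions that occupy most of the proof above arise in this model.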


The following substitution rule extends Proposition 9.9. 


Proposition 10.50. Let $(\Omega, \mathscr{F})$ and $(\Omega', \mathscr{F}')$ be measurable spaces and let $f : \Omega \to \Omega'$
be a measurable mapping. If $P : \mathscr{F} \to \mathscr{L}(H)$ is a projection-valued measure, then the
mapping $Q : \mathscr{F}' \to \mathscr{L}(H)$ defined by

$$Q_B := P_{f^{-1}(B)}, \qquad B \in \mathscr{F}',$$

is a projection-valued measure. Denoting by $\Phi$ and $\Psi$ the measurable functional calculi
of $P$ and $Q$, for all measurable functions $g : \Omega' \to \mathbb{C}$ we have

$$\Phi(g \circ f) = \Psi(g)$$

with equality of domains.
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Proof The proof is the same as that of Proposition 9.9, except that some domain issues 
have to be taken care of. Following this proof, for all nonnegative measurable functions 
g on Q! and x € X we obtain 


[setar= [840 (10.7) 


the finiteness of one of these integrals implying the finiteness of the other. Applying 
this with g replaced by |g|*, it follows that x € D(®(go f)) if and only if x € D(¥(g)). 
For all x in this common domain, (10.7) can be rewritten as (B(go f)x|x) = (W(g)x|x), 
and by polarisation this implies that for all x,y in this common domain we have (®(go 
f)xly) = (¥(g)x|y). The operator ¥(g), being normal, is densely defined and therefore 
this identity holds for all y €¢ H. The result now follows. 
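The substitution rule can be checked by hand in a finite toy model. The following sketch is an illustration only: it assumes $\Omega = \{0,\dots,5\}$, $H = \mathbb{C}^6$ viewed as functions on $\Omega$, $P_B x = 1_B x$, and a hypothetical map $f(\omega) = \omega \bmod 3$; none of these choices come from the text.

```python
OMEGA = range(6)         # finite Omega; vectors in H = C^6 are lists indexed by Omega
OMEGA_PRIME = range(3)   # finite Omega'

def f(w):
    """Hypothetical measurable map Omega -> Omega'."""
    return w % 3

def P(B, x):
    """Projection-valued measure on Omega: P_B x = 1_B x."""
    return [x[w] if w in B else 0 for w in OMEGA]

def Q(B, x):
    """Push-forward measure on Omega': Q_B := P_{f^{-1}(B)}."""
    preimage = {w for w in OMEGA if f(w) in B}
    return P(preimage, x)

def calculus(measure, points, h, x):
    """Sum over p of h(p) * measure_{p} x: the functional calculus on a finite set."""
    total = [0] * len(x)
    for p in points:
        piece = measure({p}, x)
        total = [t + h(p) * c for t, c in zip(total, piece)]
    return total

g = lambda u: (1 + 2j) * u - 4                   # an arbitrary function on Omega'
x = [1, 2j, -3, 0.5, 1 - 1j, 7]
lhs = calculus(P, OMEGA, lambda w: g(f(w)), x)   # Phi(g o f) x
rhs = calculus(Q, OMEGA_PRIME, g, x)             # Psi(g) x
```

In this finite setting all domains are the whole space, so only the algebraic identity $\Phi(g\circ f) = \Psi(g)$ is being illustrated, not the domain statement.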
The proof of the spectral theorem for unbounded normal operators proceeds by a reduction to the bounded case. The basic idea is to exploit the fact that the mapping
\[ \zeta : z \mapsto \frac{z}{(1+|z|^2)^{1/2}} \]
maps the complex plane bijectively onto the open unit disc $\mathbb{D}$. This suggests that if $A$ is a normal operator, then
\[ Z_A := A(I + A^*A)^{-1/2} \]
is a normal contraction on $H$. This is indeed the case, as will be proved in Proposition 10.53. It follows that $\sigma(Z_A) \subseteq \overline{\mathbb{D}}$. By the spectral theorem for bounded normal operators, there exists a projection-valued measure $Q$ on $\overline{\mathbb{D}}$ such that
\[ Z_A = \int_{\overline{\mathbb{D}}} \lambda \, dQ(\lambda). \]
We now define a projection-valued measure $P$ on $\mathbb{C}$ by setting $P_B := Q_{\zeta(B)}$ for Borel sets $B \subseteq \mathbb{C}$, and use Proposition 10.50 to show that
\[ A = \int_{\mathbb{C}} \lambda \, dP(\lambda). \]
In the same way, the uniqueness of $P$ for representing $A$ is reduced to the uniqueness of $Q$ for representing $Z_A$.

Some technical details need to be addressed to turn this simple idea into a rigorous proof: one has to deal with subtle domain issues and with the fact that $\zeta$ maps $\mathbb{C}$ onto the open unit disc, whereas $Q$ is supported on the closed unit disc.
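The scalar version of this reduction is easy to test numerically. The sketch below is an illustration with arbitrarily chosen sample points (not taken from the text): it evaluates $\zeta$ and the candidate inverse $w \mapsto w/(1-|w|^2)^{1/2}$ and checks that the images lie in the open unit disc and that the round trip returns the starting point.

```python
import math

def zeta(z: complex) -> complex:
    """Map z in C to z / (1 + |z|^2)^(1/2), a point of the open unit disc."""
    return z / math.sqrt(1.0 + abs(z) ** 2)

def zeta_inv(w: complex) -> complex:
    """Inverse map w -> w / (1 - |w|^2)^(1/2), defined for |w| < 1."""
    return w / math.sqrt(1.0 - abs(w) ** 2)

# Arbitrary sample points in the complex plane.
samples = [0j, 1 + 0j, -2 + 3j, 10 - 7j, 0.001j]
images = [zeta(z) for z in samples]
round_trip_error = max(abs(zeta_inv(zeta(z)) - z) for z in samples)
```

Points far from the origin are mapped close to the unit circle, which is the scalar shadow of the fact that an unbounded $A$ produces a contraction $Z_A$ whose spectrum may touch the boundary of the disc.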


We start with the proof that $Z_A$ is well defined as a contractive normal operator on $H$. This is accomplished in Proposition 10.53, for which we need two lemmas.

Lemma 10.51. Let $A$ be a closed operator in $H$, let $T \in \mathscr{L}(H)$, and let $f \in C(\sigma(T))$. Then:

(1) if $T$ is selfadjoint and $TA \subseteq AT$, then $f(T)A \subseteq Af(T)$;
(2) if $T$ is normal and $TA \subseteq AT$ and $T^*A \subseteq AT^*$, then $f(T)A \subseteq Af(T)$.

Proof We only prove the second assertion, the first being an immediate consequence.

We have $T^2A = T(TA) \subseteq T(AT) = (TA)T \subseteq (AT)T = AT^2$. Continuing by induction we see that $T^kA \subseteq AT^k$ for all $k \in \mathbb{N}$. In the same way it is seen that $(T^*)^kA \subseteq A(T^*)^k$ for all $k \in \mathbb{N}$. These inclusions imply that
\[ p(T,T^*)A \subseteq Ap(T,T^*) \]
for all polynomials $p$ in the variables $z$ and $\bar z$; notation is as in Section 8.2.b. By the Stone–Weierstrass theorem there exist polynomials $p_n$ in the variables $z$ and $\bar z$ such that $p_n(z,\bar z) \to f(z)$ uniformly with respect to $z \in \sigma(T)$. Then, by the properties of the continuous functional calculus for normal operators (Theorem 8.22),
\[ \|p_n(T,T^*) - f(T)\| = \sup_{z\in\sigma(T)} |p_n(z,\bar z) - f(z)| \to 0 \quad \text{as } n \to \infty. \]
The inclusions $p_n(T,T^*)A \subseteq Ap_n(T,T^*)$ imply that if $x \in D(A)$, then $p_n(T,T^*)x \in D(A)$ and
\[ \lim_{n\to\infty} Ap_n(T,T^*)x = \lim_{n\to\infty} p_n(T,T^*)Ax = f(T)Ax. \]
Since also $\lim_{n\to\infty} p_n(T,T^*)x = f(T)x$, the closedness of $A$ implies that $f(T)x \in D(A)$ and $Af(T)x = f(T)Ax$. This gives the result.


We have seen in Theorem 10.44 that if $A$ is a densely defined closed operator in $H$, then $A^*A$ is selfadjoint, and by Proposition 10.40 we have $\sigma(A^*A) \subseteq [0,\infty)$. This allows us to define
\[ T_A := (I + A^*A)^{-1}. \]
This operator is bounded and positive, and if $A$ is normal we have $T_A = T_{A^*}$.

Lemma 10.52. If $A$ is normal, then for all $x \in D(A)$ we have $T_Ax \in D(A)$ and
\[ AT_Ax = T_AAx. \]

Proof Let $x \in D(A)$. Then $y := T_Ax \in D(A^*A) \subseteq D(A)$, $Ay = AT_Ax \in D(A^*)$, and $A^*Ay = x - T_Ax \in D(A)$, so $Ay \in D(AA^*) = D(A^*A)$. Combining this with $(I + AA^*)A = A + (AA^*)A = A + A(A^*A) = A(I + A^*A)$, it follows that
\[ AT_Ax = [T_A(I + AA^*)]AT_Ax = T_AA[(I + A^*A)T_A]x = T_AAx. \]


Proposition 10.53. If $A$ is a normal operator, then:

(1) the range of $T_A^{1/2}$ is densely contained in $D(A)$;
(2) the operator $Z_A := AT_A^{1/2}$ is contractive and we have $T_A = I - Z_A^*Z_A$;
(3) $Z_A$ is normal and $Z_A^* = Z_{A^*}$.

Proof (1) and (2): We have $R(T_A) \subseteq D(A^*A) \subseteq D(A)$ and therefore the operator $AT_A$ is well defined on all of $H$. As a composition of a bounded operator and a closed operator, it is closed and therefore bounded by the closed graph theorem. The operator $T_A$ is bounded and positive, and the injectivity of $T_A$ implies the injectivity of its square root $T_A^{1/2}$. By selfadjointness and Proposition 4.31, this square root has dense range. For $y = T_A^{1/2}x$ in this range we have $T_A^{1/2}y = T_Ax \in D(A^*A) \subseteq D(A)$, so $y \in D(Z_A) := \{h \in H : T_A^{1/2}h \in D(A)\}$ and
\[ \|Z_Ay\|^2 = \|AT_Ax\|^2 = (A^*AT_Ax|T_Ax) = (x|T_Ax) - (T_Ax|T_Ax) \le (x|T_Ax) = (y|y) = \|y\|^2. \]
Since the range of $T_A^{1/2}$ is dense, so is $D(Z_A)$ and therefore, with respect to the norm of $H$, $Z_A$ is contractive from its dense domain into $H$. The operator $Z_A$ is also closed, for if $y_n \to y$ in $H$ with $y_n \in D(Z_A)$ and $Z_Ay_n = AT_A^{1/2}y_n \to y'$ in $H$, the closedness of $A$ implies $T_A^{1/2}y \in D(A)$ and $AT_A^{1/2}y = y'$; but then $y \in D(Z_A)$ and $Z_Ay = y'$. Thus $Z_A$ is closed, densely defined, and contractive with respect to the norm of $H$. This forces $D(Z_A) = H$.

We have already shown that the range of $T_A^{1/2}$ is contained in $D(A)$. To see that this inclusion is dense with respect to the graph norm it suffices to note that $R(T_A) = D(A^*A)$ is dense in $D(A)$ with respect to the graph norm of $D(A)$ by Theorem 10.44. The inclusions $R(T_A) \subseteq R(T_A^{1/2}) \subseteq D(A)$ therefore imply that the inclusion $R(T_A^{1/2}) \subseteq D(A)$ is dense with respect to the graph norm of $D(A)$.

By Lemma 10.52, for $x \in D(A)$ we have $T_Ax \in D(A)$ and $AT_Ax = T_AAx$, so $T_AA \subseteq AT_A$, and then Lemma 10.51 implies
\[ T_A^{1/2}A \subseteq AT_A^{1/2}. \tag{10.8} \]
Also, for $x \in D(A)$,
\[ (Z_A^*Z_Ax|x) = (AT_A^{1/2}x|AT_A^{1/2}x) = (T_A^{1/2}Ax|T_A^{1/2}Ax) = (T_AAx|Ax) = (AT_Ax|Ax) = (A^*AT_Ax|x) = ((I - T_A)x|x). \]
Since both $T_A$ and $Z_A$ are bounded, the identity $(Z_A^*Z_Ax|x) = ((I - T_A)x|x)$ extends to arbitrary $x \in H$. This implies the operator identity $T_A = I - Z_A^*Z_A$.

(3): Since $A$ is normal we have $T_{A^*} = T_A$. Since this operator is selfadjoint and $A^*$ is normal, it follows that
\[ Z_{A^*} = A^*T_{A^*}^{1/2} = A^*T_A^{1/2} = A^*(T_A^{1/2})^* \overset{\rm (i)}{=} (T_A^{1/2}A)^* \overset{\rm (ii)}{\supseteq} (AT_A^{1/2})^* = Z_A^*, \]
where both (i) and (ii) follow from Proposition 10.19 and (10.8). Both $Z_A$ and $Z_{A^*}$ are bounded (the latter by Proposition 10.53 applied to $A^*$), and therefore
\[ Z_{A^*} = Z_A^*. \]
Normality of $Z_A$ now follows from
\[ (Z_A^*Z_Ax|x) = (Z_Ax|Z_Ax) = (AT_A^{1/2}x|AT_A^{1/2}x) = (A^*AT_A^{1/2}x|T_A^{1/2}x) \]
and
\[ (Z_AZ_A^*x|x) = (Z_A^*x|Z_A^*x) = (Z_{A^*}x|Z_{A^*}x) = (AA^*T_A^{1/2}x|T_A^{1/2}x), \]
observing that the two right-hand sides are equal since $A$ is normal.
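A worked instance may help to see what the operators $T_A$ and $Z_A$ look like. The computation below is an added illustration, not part of the proof: it assumes that $A = A_m$ is the operator of multiplication by a measurable function $m$ on $L^2(\Omega,\mu)$ (treated in Example 10.60), uses the standard fact that $A_m^* = A_{\bar m}$, and writes $A_h$ for multiplication by $h$.

```latex
\begin{align*}
T_{A_m} &= (I + A_m^*A_m)^{-1} = A_{(1+|m|^2)^{-1}}, \\
Z_{A_m} &= A_m T_{A_m}^{1/2} = A_{m(1+|m|^2)^{-1/2}} = A_{\zeta\circ m},
  \qquad \zeta(z) = \frac{z}{(1+|z|^2)^{1/2}}, \\
I - Z_{A_m}^*Z_{A_m} &= A_{1 - |m|^2(1+|m|^2)^{-1}} = A_{(1+|m|^2)^{-1}} = T_{A_m}.
\end{align*}
```

Since $|\zeta\circ m| < 1$ pointwise, $Z_{A_m}$ is a contraction defined on all of $L^2(\Omega,\mu)$ even when $m$ is unbounded, in agreement with part (2) of the proposition.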


We are now ready to state and prove the main result of this section.

Theorem 10.54 (Spectral theorem for normal operators). For every normal operator $A$ there exists a unique projection-valued measure $P$ on $\sigma(A)$ such that
\[ A = \int_{\sigma(A)} \lambda \, dP(\lambda). \]

Proof Consider the mapping
\[ \zeta : z \mapsto \frac{z}{(1+|z|^2)^{1/2}}, \tag{10.9} \]
which maps the complex plane bijectively onto the open unit disc $\mathbb{D}$, with inverse
\[ \zeta^{-1} : w \mapsto \frac{w}{(1-|w|^2)^{1/2}}. \]
Define the projection-valued measure $P$ on $\mathbb{C}$ by
\[ P_B := Q_{\zeta(B)}, \quad B \in \mathscr{B}(\mathbb{C}), \]
where $Q$ is the projection-valued measure of the normal contraction $Z_A$, which is supported on $\sigma(Z_A)$. Since $Z_A$ is contractive, $\sigma(Z_A)$ is contained in $\overline{\mathbb{D}}$. It will be convenient to think of $Q$ as supported on $\overline{\mathbb{D}}$. The proof that $P$ has the desired properties and is unique is carried out in several steps.

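In what follows we let $\Phi$ and $\Psi$ denote the measurable functional calculi of $P$ and $Q$.

Step 1 – Let $\mathrm{id}(\lambda) := \lambda$. We begin by proving the inclusion
\[ R(T_A^{1/2}) \subseteq D(\Phi(\mathrm{id})), \]
where $\Phi(\mathrm{id}) = \int_{\mathbb{C}} \mathrm{id} \, dP$.

Let $\rho \in C_c(\mathbb{C})$ satisfy $0 \le \rho \le 1$ pointwise. Using Proposition 10.50, the fact that $\rho \circ \zeta^{-1}$ has compact support in $\mathbb{D}$, and the fact that $Q$ is supported on $\sigma(Z_A) \subseteq \overline{\mathbb{D}}$,
\[ \int_{\mathbb{C}} \rho(z)|z|^2 \, dP_x(z) = \int_{\mathbb{D}} \rho(\zeta^{-1}(\lambda))|\zeta^{-1}(\lambda)|^2 \, dQ_x(\lambda) = \int_{\sigma(Z_A)} \rho(\zeta^{-1}(\lambda))|\zeta^{-1}(\lambda)|^2 \, dQ_x(\lambda) = (\phi(Z_A)x|x), \]
where $\phi \in C(\sigma(Z_A))$ is the function
\[ \phi(\lambda) = \rho(\zeta^{-1}(\lambda))|\zeta^{-1}(\lambda)|^2 = \rho(\zeta^{-1}(\lambda))|\lambda|^2(1-|\lambda|^2)^{-1}. \]
Suppose now that $x \in R(T_A^{1/2})$, say
\[ x = T_A^{1/2}y = (I - |Z_A|^2)^{1/2}y \tag{10.10} \]
for some $y \in H$; this follows from $T_A = I - Z_A^*Z_A = I - |Z_A|^2$. Then,
\[ \int_{\mathbb{C}} \rho(z)|z|^2 \, dP_x(z) = (\phi(Z_A)x|x) = (\rho(\zeta^{-1}(Z_A))|Z_A|^2y|y) = \|\rho^{1/2}(\zeta^{-1}(Z_A))Z_Ay\|^2 \le \|\rho\|_\infty\|Z_Ay\|^2 \le \|Z_Ay\|^2 = \|Ax\|^2, \]
keeping in mind that $x \in R(T_A^{1/2}) \subseteq D(A)$. Applying this to a sequence $\rho_n \in C_c(\mathbb{C})$ satisfying $0 \le \rho_n \uparrow 1$ pointwise as $n \to \infty$, by monotone convergence we obtain
\[ \int_{\mathbb{C}} |\mathrm{id}|^2 \, dP_x = \int_{\mathbb{C}} |z|^2 \, dP_x(z) \le \|Ax\|^2 < \infty. \]
This proves that $x \in D(\Phi(\mathrm{id}))$.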


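Step 2 – We now prove that for $x = T_A^{1/2}y \in R(T_A^{1/2})$ we have
\[ \int_{\mathbb{C}} z \, dP_x(z) = (Ax|x). \]
Repeating the reasoning in Step 1 with $\rho \in C_c(\mathbb{C})$ as before, with
\[ \phi(\lambda) = \rho(\zeta^{-1}(\lambda))\zeta^{-1}(\lambda) = \rho(\zeta^{-1}(\lambda))\lambda(1-|\lambda|^2)^{-1/2}, \]
we obtain
\[ \int_{\mathbb{C}} \rho(z)z \, dP_x(z) = (\phi(Z_A)x|x) = (\rho(\zeta^{-1}(Z_A))Z_A(I - |Z_A|^2)^{1/2}y|y). \]
Applying this to a sequence $\rho_n \in C_c(\mathbb{C})$ satisfying $0 \le \rho_n \uparrow 1$ pointwise as $n \to \infty$, by dominated convergence, the convergence property of the bounded functional calculus, and (10.10), we obtain
\begin{align*}
\int_{\mathbb{C}} z \, dP_x(z) &= \lim_{n\to\infty} \int_{\mathbb{C}} \rho_n(z)z \, dP_x(z) = \lim_{n\to\infty} (\rho_n(\zeta^{-1}(Z_A))Z_A(I - Z_A^*Z_A)^{1/2}y|y) \\
&= (Z_A(I - Z_A^*Z_A)^{1/2}y|y) = (AT_Ay|y) = (Ax|x).
\end{align*}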


Step 3 – Since both $A$ and $\Phi(\mathrm{id})$ are closed, and since $R(T_A^{1/2})$ is dense in $D(A)$ by Proposition 10.53, the result of Step 2 implies that $A \subseteq \Phi(\mathrm{id})$. Since both operators are normal, the identity $A = \Phi(\mathrm{id}) = \int_{\mathbb{C}} \lambda \, dP(\lambda)$ follows from Proposition 10.47.


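Step 4 – It remains to prove uniqueness of $P$. We will do so by reducing matters to the uniqueness of $Q$.

Suppose that
\[ A = \int_{\sigma(A)} \lambda \, d\tilde P(\lambda) \]
for a projection-valued measure $\tilde P$ on $\sigma(A)$. Let $\tilde\Phi$ denote the measurable calculus associated with $\tilde P$. We have $\tilde\Phi(\mathrm{id}) = A$ and $\tilde\Phi(\overline{\mathrm{id}}) = A^*$ by Theorem 10.48(2). By the multiplicativity,
\[ \tilde\Phi(1_{\{|\mathrm{id}|\le n\}}|\mathrm{id}|^2) = \tilde\Phi(1_{\{|\mathrm{id}|\le n\}}\overline{\mathrm{id}})\,\tilde\Phi(1_{\{|\mathrm{id}|\le n\}}\mathrm{id}). \]
Taking limits $n \to \infty$ using Theorem 10.48(3), for $x \in D(A^*A)$ we have $x \in D(\tilde\Phi(|\mathrm{id}|^2))$ and
\[ \tilde\Phi(|\mathrm{id}|^2)x = \tilde\Phi(\overline{\mathrm{id}})\tilde\Phi(\mathrm{id})x = A^*Ax. \]
Similar arguments show that
\[ \tilde\Phi((1+|\mathrm{id}|^2)^{-1})x = (\tilde\Phi(1+|\mathrm{id}|^2))^{-1}x = (I + A^*A)^{-1}x = T_Ax. \]
Since $D(A^*A)$ is dense, this identity extends to arbitrary $x \in H$. Then, as in Step 1 of the proof of Theorem 10.54, the multiplicativity for the measurable calculus for bounded selfadjoint operators and the uniqueness of positive square roots gives
\[ \tilde\Phi((1+|\mathrm{id}|^2)^{-1/2}) = T_A^{1/2}. \]
Hence,
\[ \tilde\Phi(\zeta)x = \tilde\Phi(\mathrm{id}\,(1+|\mathrm{id}|^2)^{-1/2})x = AT_A^{1/2}x = Z_Ax. \tag{10.11} \]
Since $D(A^*A)$ is dense in $H$, this identity extends to arbitrary $x \in H$.

Consider the projection-valued measure $\tilde Q$ on $\overline{\mathbb{D}}$ given by $\tilde Q_B := \tilde P_{\zeta^{-1}(B)}$, where $\zeta : \mathbb{C} \to \mathbb{D}$ is the bijection of (10.9). In what follows we view $\zeta$ as a measurable mapping from $\mathbb{C}$ to $\overline{\mathbb{D}}$. With $\rho \in C_c(\mathbb{C})$ as before, by (10.11) and Proposition 10.50 we have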


(B(p)Zax|x) = (B(p)¥(E)x|x) = (B(pf)x|x) 
= Jim [ ee lim [ cn all 


n—-o0o noo 


= [oc ~"(u))udOx(u = foe L))udO,(u). 


Applying this to a sequence $\rho_n \in C_c(\mathbb{C})$ satisfying $0 \le \rho_n \uparrow 1$ pointwise as $n \to \infty$, by the convergence property of the bounded functional calculus and dominated convergence we obtain
\[ (Z_Ax|x) = \lim_{n\to\infty} (\tilde\Phi(\rho_n)Z_Ax|x) = \lim_{n\to\infty} \int_{\mathbb{D}} \rho_n(\zeta^{-1}(\mu))\mu \, d\tilde Q_x(\mu) = \int_{\mathbb{D}} \mu \, d\tilde Q_x(\mu). \]
By Proposition 9.12, the support of $\tilde Q$ is contained in $\sigma(Z_A)$, and therefore we have
\[ Z_A = \int_{\sigma(Z_A)} \mu \, d\tilde Q(\mu). \]
This shows that $\tilde Q$ is the projection-valued measure of $Z_A$. Now Proposition 9.13 implies that $\tilde Q = Q$ and hence $\tilde P = P$.


For normal operators $A$ and measurable functions $f : \sigma(A) \to \mathbb{C}$, the operator $\Psi(f)$ defined in terms of the projection-valued measure of $A$ by the calculus of Theorem 10.48 will be denoted by $f(A)$:
\[ f(A) := \Psi(f) = \int_{\sigma(A)} f \, dP. \]
In the same way as for bounded normal operators in Theorem 9.19, the properties of the bounded calculus $\Psi$ translate into corresponding properties for the mapping $f \mapsto f(A)$. The result of Example 10.49 says that for any normal operator $A$ and measurable function $f : \sigma(A) \to \mathbb{C}$,
\[ (f^n)(A) = (f(A))^n, \quad n = 1, 2, \dots \]
If $P$ is a projection-valued measure on a measurable space $(\Omega,\mathscr{F})$ and $f : \Omega \to \mathbb{C}$ is measurable, the $P$-essential range of $f$ is the set $R_P(f)$ of all $z \in \mathbb{C}$ such that $P_{E_{z,r}} \ne 0$ for all $r > 0$, where
\[ E_{z,r} := \{\omega \in \Omega : |f(\omega) - z| < r\}. \]
It is easy to see that $R_P(f)$ is a closed set contained in $\overline{f(\Omega)}$.


Theorem 10.55 (Spectral mapping theorem). Let $A$ be normal with projection-valued measure $P$, and let $f : \sigma(A) \to \mathbb{C}$ be measurable. Then
\[ \sigma(f(A)) = R_P(f) \subseteq \overline{f(\sigma(A))}. \]
If $f$ is continuous, then
\[ \sigma(f(A)) = \overline{f(\sigma(A))}. \]

Proof Let $z \in \complement R_P(f)$. Since $R_P(f)$ is closed, the function $g_z : \lambda \mapsto (z - f(\lambda))^{-1}$ is well defined $P$-almost everywhere and bounded on $\sigma(A)$, and therefore the operator $g_z(A)$ is bounded. Moreover, $(z - f(A))g_z(A) = g_z(A)(z - f(A)) = I$ by Theorem 10.48 and the boundedness of $g_z$. It follows that $g_z(A)$ is a two-sided inverse for $z - f(A)$. This proves the inclusion $\sigma(f(A)) \subseteq R_P(f)$.

Suppose next that $z \in R_P(f)$. Since $P$ is supported on $\sigma(A)$, for $n = 1, 2, \dots$ the orthogonal projections $P_{E_{z,1/n}}$ are nonzero, where $E_{z,1/n} = \{\lambda \in \sigma(A) : |f(\lambda) - z| < \frac1n\}$. In particular, $E_{z,1/n} \ne \emptyset$, and this implies that $d(z, f(\sigma(A))) < \frac1n$. This being true for all $n \ge 1$, it follows that $z \in \overline{f(\sigma(A))}$.

If $f$ is continuous and $\mu \in \mathbb{C}$ is such that $f(\mu) \notin \sigma(f(A)) = R_P(f)$, then for $\varepsilon > 0$ small enough the relatively open set
\[ N := \{\lambda \in \sigma(A) : |f(\lambda) - f(\mu)| < \varepsilon\} \]
satisfies $P_N = 0$. The continuity of $f$ implies that $B(\mu;\delta) \cap \sigma(A) \subseteq N$ for some small enough $\delta > 0$. But then $\mu \notin R_P(\mathrm{id}) = \sigma(\mathrm{id}(A)) = \sigma(A)$. This proves the inclusion $f(\sigma(A)) \subseteq \sigma(f(A))$, which self-improves to $\overline{f(\sigma(A))} \subseteq \sigma(f(A))$ since $\sigma(f(A))$ is closed.


The following result gives necessary and sufficient conditions for the presence of eigenvalues for the operators $f(A)$.

Theorem 10.56 (Eigenvalues). Let $A$ be a normal operator with projection-valued measure $P$, and let $f : \sigma(A) \to \mathbb{C}$ be measurable. For $\mu \in \mathbb{C}$ let $N_f(\mu) := \{\lambda \in \sigma(A) : f(\lambda) = \mu\}$. The following assertions are equivalent:

(1) $\mu$ is an eigenvalue of $f(A)$;
(2) $P_{N_f(\mu)} \ne 0$.

In this situation, $P_{N_f(\mu)}$ is the orthogonal projection onto the corresponding eigenspace, and for a vector $x \in H$ the following assertions are equivalent:

(3) $x \in D(f(A))$ and $f(A)x = \mu x$;
(4) $P_{N_f(\mu)}x = x$.

Proof Upon replacing $f$ by $f - \mu$ we may assume that $\mu = 0$. Set $N_f := N_f(0)$ for brevity.

If $x \in D(f(A))$ satisfies $f(A)x = 0$, then $f(\lambda) = 0$ for $P_x$-almost all $\lambda \in \sigma(A)$ by (10.4). This is equivalent to saying that $P_x$-almost every point of $\sigma(A)$ is contained in $N_f$, that is, $P_x(\sigma(A) \setminus N_f) = 0$. This, in turn, is equivalent to saying that $P_{\sigma(A)\setminus N_f}x = 0$, that is, $x - P_{N_f}x = 0$. Conversely, if $P_{N_f}x = x$, or equivalently, if $f = 0$ $P_x$-almost everywhere on $\sigma(A)$, then $x \in D(\Phi(f)) = D(f(A))$ by the definition of $D(\Phi(f))$ in Theorem 10.48.

This proves the equivalence (3)$\Leftrightarrow$(4). This equivalence also establishes that $P_{N_f}$ is the orthogonal projection onto the eigenspace $\{x \in D(f(A)) : f(A)x = 0\}$. It further shows that if $0$ is an eigenvalue of $f(A)$, with eigenvector $x \in D(f(A))$, then
\[ P_x(N_f) = P_x(\sigma(A)) - P_x(\sigma(A) \setminus N_f) = P_x(\sigma(A)) = \|x\|^2 \ne 0 \]
since $x \ne 0$. This proves the implication (1)$\Rightarrow$(2). If (2) holds, there exists a nonzero $x \in H$ with $P_{N_f(\mu)}x = x$ and (1) follows from the implication (4)$\Rightarrow$(3).

Corollary 10.57. Let $A$ be a normal operator and let $f : \sigma(A) \to \mathbb{C}$ be measurable. If $x \in D(A)$ satisfies $Ax = \lambda x$, then $x \in D(f(A))$ and $f(A)x = f(\lambda)x$.
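In finite dimensions the corollary reduces to a statement about matrices that can be verified directly. The sketch below is an illustration under assumptions not made in the text: it takes the normal matrix $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ with eigenvalues $\pm i$, builds $f(A) = \sum_\lambda f(\lambda) P_{\{\lambda\}}$ from the rank-one spectral projections, and checks $f(A)x = f(\lambda)x$ for $f = \exp$.

```python
import cmath

def outer(v):
    """Rank-one orthogonal projection v v* onto span{v}, for a unit vector v."""
    return [[a * b.conjugate() for b in v] for a in v]

def apply(M, x):
    """Matrix-vector product."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

s = 2 ** -0.5
# Eigenvalue -> unit eigenvector of A = [[0, -1], [1, 0]] (a normal matrix).
spectrum = {1j: [s, -1j * s], -1j: [s, 1j * s]}
projections = {lam: outer(v) for lam, v in spectrum.items()}

def f_of_A(f):
    """f(A) = sum over eigenvalues of f(lambda) * P_{lambda}."""
    return [[sum(f(lam) * P[i][j] for lam, P in projections.items())
             for j in range(2)] for i in range(2)]

A = f_of_A(lambda lam: lam)          # recovers the matrix A itself
exp_A = f_of_A(cmath.exp)            # the matrix exponential of A
x = spectrum[1j]                     # eigenvector: A x = i x
residual = max(abs(u - cmath.exp(1j) * v) for u, v in zip(apply(exp_A, x), x))
```

Here `residual` measures the failure of $\exp(A)x = e^{i}x$ and is zero up to rounding.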


As in the bounded case, we can use the functional calculus to define square roots:

Proposition 10.58. If $A$ is a positive selfadjoint operator, then $A$ admits a unique positive selfadjoint square root $A^{1/2}$.

Proof The operator $f(A)$ with $f(\lambda) = \lambda^{1/2}$ is selfadjoint and positive, and squares to $A$ by the result of Example 10.49. This proves existence.

To prove uniqueness, suppose that $B$ is a positive selfadjoint operator satisfying $B^2 = A$. Let $P$ and $Q$ be the projection-valued measures of $A$ and $B$; both are supported on $[0,\infty)$ since $A$ and $B$ are selfadjoint and positive. Let $R$ be the projection-valued measure on $[0,\infty)$ defined by $R_C := Q_{C^{1/2}}$ for Borel sets $C \subseteq [0,\infty)$, where $C^{1/2} := \{\lambda \ge 0 : \lambda^2 \in C\}$. By Proposition 10.50 and the result of Example 10.49,
\[ \int_{[0,\infty)} \lambda \, dR(\lambda) = \int_{[0,\infty)} \lambda^2 \, dQ(\lambda) = B^2 = A = \int_{[0,\infty)} \lambda \, dP(\lambda). \]
It follows that both $R$ and $P$ are projection-valued measures representing $A$. By the uniqueness part of Theorem 10.54 we therefore have $R = P$. But then
\[ A^{1/2} = \int_{[0,\infty)} \lambda^{1/2} \, dP(\lambda) = \int_{[0,\infty)} \lambda^{1/2} \, dR(\lambda) = \int_{[0,\infty)} \lambda \, dQ(\lambda) = B. \]
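The existence part of the proof is constructive and can be mimicked for matrices. The following sketch is a finite-dimensional illustration only, assuming the positive symmetric matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ (eigenvalues $1$ and $3$), which is not taken from the text: the square root is obtained by applying $\lambda \mapsto \lambda^{1/2}$ to the eigenvalues in the spectral decomposition.

```python
import math

def outer(v):
    """Rank-one orthogonal projection onto span{v}, for a real unit vector v."""
    return [[a * b for b in v] for a in v]

def matmul(M, N):
    """Product of two square matrices of the same size."""
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

s = 1 / math.sqrt(2)
# (eigenvalue, unit eigenvector) pairs of A = [[2, 1], [1, 2]].
spectral_data = [(1.0, [s, -s]), (3.0, [s, s])]

def f_of_A(f):
    """f(A) = sum over eigenvalues of f(lambda) * P_{lambda}."""
    return [[sum(f(lam) * outer(v)[i][j] for lam, v in spectral_data)
             for j in range(2)] for i in range(2)]

A = f_of_A(lambda lam: lam)     # reassembles A from its spectral data
root = f_of_A(math.sqrt)        # the positive square root A^{1/2}
square = matmul(root, root)
error = max(abs(square[i][j] - A[i][j]) for i in range(2) for j in range(2))
```

Uniqueness is not something a numerical check can establish; the sketch only illustrates that the spectral recipe produces a positive matrix whose square is $A$.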


If $A$ is normal, we may use the measurable calculus to define $|A| := f(A)$, where $f(\lambda) = |\lambda|$. Furthermore, $A^*A$ is selfadjoint and positive, so it has a unique selfadjoint and positive square root $(A^*A)^{1/2}$ by Proposition 10.58. The next corollary extends Corollary 8.29 to unbounded normal operators:

Corollary 10.59. For every normal operator $A$ we have $D(|A|) = D(A)$ and
\[ (A^*A)^{1/2} = |A|. \]

Proof With $\mathrm{id}(\lambda) = \lambda$ we have $D(|A|) = D(\Phi(|\mathrm{id}|)) = D(\Phi(\mathrm{id})) = D(A)$, the middle identity being immediate from the definition of these domains. Applying the result of Example 10.49 twice we obtain, with $f(\lambda) = |\lambda|$,
\[ |A|^2 = f(A)f(A) = f^2(A) = (\overline{\mathrm{id}}\cdot\mathrm{id})(A) = \overline{\mathrm{id}}(A)\,\mathrm{id}(A) = A^*A, \]
with justification of the domain equalities as in the preceding proof. The identity $|A| = (|A|^2)^{1/2}$ now follows by taking positive square roots using Proposition 10.58.

We proceed with some examples.

Example 10.60 (Multiplication operators). Let $(\Omega,\mathscr{F},\mu)$ be a measure space and let $m : \Omega \to \mathbb{C}$ be a measurable function. The linear operator $A_m$ defined by
\[ D(A_m) := \{f \in L^2(\Omega,\mu) : mf \in L^2(\Omega,\mu)\}, \]
\[ A_mf := mf, \quad f \in D(A_m), \]
is normal, its spectrum $\sigma(A_m)$ equals the $\mu$-essential range of $m$, and its projection-valued measure is given by
\[ P_Bf = 1_{m^{-1}(B)}f \]
for $B \in \mathscr{B}(\sigma(A_m))$ and $f \in L^2(\Omega,\mu)$.
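On a finite set with counting measure the multiplication operator is a diagonal matrix and everything can be computed exactly. The sketch below is an illustration with a hypothetical multiplier on $\Omega = \{0,\dots,5\}$ (values chosen arbitrarily): it implements $P_Bf = 1_{m^{-1}(B)}f$ and reassembles $A_mf$ as $\sum_\lambda \lambda P_{\{\lambda\}}f$ over the values $\lambda$ taken by $m$.

```python
# Hypothetical multiplier m on Omega = {0, ..., 5} and a vector f in L^2(Omega) = R^6.
m = [2, 5, 2, -1, 5, 2]
f = [1.0, -2.0, 0.5, 3.0, 4.0, -1.5]

def A_m(f):
    """Multiplication operator: (A_m f)(w) = m(w) f(w)."""
    return [mi * fi for mi, fi in zip(m, f)]

def P(B, f):
    """Projection-valued measure of A_m: P_B f = 1_{m^{-1}(B)} f."""
    return [fi if mi in B else 0.0 for mi, fi in zip(m, f)]

# With counting measure the essential range is the set of values taken by m.
spectrum = sorted(set(m))

# Spectral decomposition: A_m f = sum over lambda of lambda * P_{lambda} f.
reassembled = [sum(lam * P({lam}, f)[i] for lam in spectrum)
               for i in range(len(f))]
```

The projections for disjoint sets are mutually orthogonal and each $P_B$ is idempotent, which is what the assertions accompanying this sketch check.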
Example 10.61 (Fourier multiplication operators). Let $m : \mathbb{R}^d \to \mathbb{C}$ be a measurable function. The linear operator $T_m$ defined by
\[ D(T_m) := \{f \in L^2(\mathbb{R}^d) : \mathscr{F}f \in D(A_m)\}, \]
\[ T_mf := \mathscr{F}^{-1}A_m\mathscr{F}f, \quad f \in D(T_m), \]
where $A_m$ is the operator of the preceding example, is normal, its spectrum equals $\sigma(T_m) = \sigma(A_m)$, and its projection-valued measure is given by
\[ P_Bf = \mathscr{F}^{-1}(1_{m^{-1}(B)}\mathscr{F}f) = T_{1_{m^{-1}(B)}}f \]
for $B \in \mathscr{B}(\sigma(T_m))$ and $f \in L^2(\mathbb{R}^d)$. Thus, the projections in the range of the projection-valued measure of the Fourier multiplier operator $T_m$ are Fourier multiplier operators themselves.
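A discrete analogue makes the last remark concrete. The sketch below is an illustration under assumptions not made in the text: the unitary discrete Fourier transform on $\mathbb{C}^8$ stands in for the Fourier–Plancherel transform on $L^2(\mathbb{R}^d)$, and the multiplier $m(k) = k^2$ is hypothetical. The projection $P_B$ is realised as the multiplier operator with symbol $1_{m^{-1}(B)}$.

```python
import cmath

N = 8  # size of the discrete model

def dft(f):
    """Unitary discrete Fourier transform on C^N."""
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N ** 0.5
            for k in range(N)]

def idft(F):
    """Inverse of the unitary discrete Fourier transform."""
    return [sum(F[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N ** 0.5
            for n in range(N)]

m = [k * k for k in range(N)]  # hypothetical multiplier symbol m(k) = k^2

def T(symbol, f):
    """Fourier multiplier: transform, multiply by the symbol, transform back."""
    return idft([s * c for s, c in zip(symbol, dft(f))])

def P(B, f):
    """Spectral projection of T_m: the multiplier with symbol 1_{m^{-1}(B)}."""
    indicator = [1.0 if mk in B else 0.0 for mk in m]
    return T(indicator, f)

f = [1, 2, 0, -1, 3, 0.5, -2, 1]
B = {0, 1, 4}
once = P(B, f)
twice = P(B, once)
idempotency_error = max(abs(a - b) for a, b in zip(once, twice))

# Spectral decomposition: T_m f = sum over lambda of lambda * P_{lambda} f.
spectrum = sorted(set(m))
reassembled = [sum(lam * P({lam}, f)[n] for lam in spectrum) for n in range(N)]
reassembly_error = max(abs(a - b) for a, b in zip(reassembled, T(m, f)))
```

Both error quantities vanish up to rounding, reflecting that $P_B$ is a projection and that $T_m = \sum_\lambda \lambda P_{\{\lambda\}}$ in this finite model.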


To conclude we complete the proof of Theorem 9.28.

Proof of Theorem 9.28, the inclusion '$\subseteq$' We split the proof into four steps.

Step 1 – Let $(h_n)_{n\ge1}$ be a dense sequence in $H$ and set
\[ g_n := h_n - \sum_{k=1}^{n-1} \pi_kh_n, \quad n \ge 1, \]
where $\pi_k$ is the orthogonal projection onto the closed linear span $H_k$ of the sequence $(T^jg_k)_{j\in\mathbb{N}}$. We further set $\pi_0 := 0$.

We claim that $T\pi_k = \pi_kT$ for all $k \in \mathbb{N}$. For $k = 0$ this is trivial and for $k \ge 1$ this follows from the following argument. If $h \in R(\pi_k) = H_k$, then $Th \in H_k = R(\pi_k)$ and therefore $T\pi_kh = Th = \pi_kTh$. If $h \perp R(\pi_k)$, then $h \perp T^jg_k$ for all $j \in \mathbb{N}$ and then $(Th|T^jg_k) = (h|T^{j+1}g_k) = 0$ by selfadjointness, so $Th \perp T^jg_k$ for all $j \in \mathbb{N}$ and consequently $Th \perp R(\pi_k)$; this implies $T\pi_kh = 0 = \pi_kTh$.

We claim that
\[ \pi_j\pi_k = \pi_k\pi_j = 0, \quad j, k \in \mathbb{N}, \ j \ne k. \tag{10.12} \]




It is clear that (10.12) holds for $j = 0$ and $k = 1$. Suppose (10.12) has been proved for all indices $0 \le j \ne k \le n-1$. If $j < n$, then
\[ \pi_jg_n = \pi_jh_n - \pi_j\sum_{k=1}^{n-1} \pi_kh_n = \pi_jh_n - \pi_jh_n = 0 \]
by the induction hypothesis. In particular,
\[ \pi_jT^kg_n = T^k\pi_jg_n = 0, \quad k \in \mathbb{N}, \]
from which it follows that $H_n = R(\pi_n)$ is contained in $N(\pi_j)$, so $H_n \perp H_j$. This implies that $\pi_j\pi_n = \pi_n\pi_j = 0$, completing the induction step.

Next we claim that $\pi h := \sum_{j\ge1} \pi_jh = h$ for all $h \in H$. By density it suffices to prove this for the vectors $h = h_n$ from the dense sequence. For these vectors,
\[ \pi h_n = \pi g_n + \sum_{k=1}^{n-1} \pi\pi_kh_n = g_n + \sum_{k=1}^{n-1} \pi_kh_n = h_n, \]
where we used the orthogonality of the projections $\pi_k$ to see that $\pi g_n = \pi_ng_n = g_n$ and $\pi\pi_k = \pi_k$.

Choose a sequence of numbers $c_n > 0$ such that the sum $\sum_{n\ge1} c_ng_n$ converges; for instance, one could choose $c_n = 2^{-n}(\|g_n\| + 1)^{-1}$. Denote its sum by $x$. Let $H_x$ denote the closed linear span of the sequence $(T^kx)_{k\in\mathbb{N}}$ and let $\pi_x$ denote the orthogonal projection onto $H_x$. By the same argument as before the operators $\pi_x$ and $T$ commute.

Step 2 – Let $S \in \{T\}''$ be arbitrary and fixed. Since $\pi_x \in \{T\}'$ it follows that $S\pi_x = \pi_xS$. As a result, $Sx \in H_x$. Since $H_x$ is the closed linear span of the vectors $T^jx$, $j \in \mathbb{N}$, it follows that there exist polynomials $p_n$ such that $p_n(T)x \to Sx$. Then,
\[ \|p_n(T)x - p_m(T)x\|^2 = \int_{\sigma(T)} |p_n - p_m|^2 \, dP_x, \]
where $P$ is the projection-valued measure of $T$. Since the left-hand side tends to $0$ as $m, n \to \infty$, the sequence $(p_n)_{n\ge1}$ is Cauchy in $L^2(\sigma(T),P_x)$. Let $f \in L^2(\sigma(T),P_x)$ be its limit and pick any pointwise defined measurable function representing it. With slight abuse of notation, this representative is denoted by $f$ again.

Step 3 – Relative to the projection-valued measure $P$, by the measurable calculus of Theorem 10.48 the operator $f(T) := \Phi(f)$ with domain
\[ D(f(T)) = \Big\{h \in H : \int_{\sigma(T)} |f|^2 \, dP_h < \infty\Big\} \]
is well defined and selfadjoint. To complete the proof we show that $D(f(T)) = H$, $f(T)$ is bounded (hence a posteriori $f$ can be taken to be bounded), and $S = f(T)$.

We begin by noting that $x \in D(f(T))$ and


S. =) dP, = lim n dP, = lim p,(T)x = f(T )x. 
v= [fae = tim |) padPe= Him pal Tx = F(T): 
If $V \in \mathscr{L}(H)$ commutes with $T$, then $V \in \{T\}'$, so $V$ commutes with $S$. We claim that $Vx \in D(f(T))$ and
\[ f(T)Vx = Vf(T)x = VSx. \]
Indeed, first we note that
\[ \int_{\sigma(T)} |p_n - p_m|^2 \, dP_{Vx} = \|(p_n(T) - p_m(T))Vx\|^2 = \|V(p_n(T) - p_m(T))x\|^2 \to \|V(S - S)x\|^2 = 0. \]
Since $p_n \to f$ in $L^2(\sigma(T),P_x)$, after passing to a subsequence if necessary, we may assume that $p_n \to f$ pointwise outside a Borel set $N$ satisfying $P_x(N) = 0$. But, since $V$ commutes with $P_N$,
\[ P_{Vx}(N) = \int_{\sigma(T)} 1_N \, dP_{Vx} = (P_NVx|Vx) = (P_Nx|V^*Vx) \le \|P_Nx\|\|V^*Vx\| = (P_Nx|x)^{1/2}\|V^*Vx\| = 0. \]
This implies that also $p_n \to f$ pointwise $P_{Vx}$-almost everywhere. This allows us to invoke the dominated convergence theorem to conclude that for every $N \ge 1$,
\begin{align*}
\int_{\sigma(T)} |f|^2 \wedge N \, dP_{Vx} &= \lim_{n\to\infty} \int_{\sigma(T)} |p_n|^2 \wedge N \, dP_{Vx} \\
&\le \limsup_{n\to\infty} \int_{\sigma(T)} |p_n|^2 \, dP_{Vx} \\
&= \limsup_{n\to\infty} \|p_n(T)Vx\|^2 \\
&= \limsup_{n\to\infty} \|Vp_n(T)x\|^2 \\
&\le \|V\|^2 \limsup_{n\to\infty} \|p_n(T)x\|^2 = \|V\|^2 \int_{\sigma(T)} |f|^2 \, dP_x.
\end{align*}
Letting $N \to \infty$ and applying the monotone convergence theorem, this proves that $Vx \in D(f(T))$. The required identity then follows from
\[ f(T)Vx = \lim_{n\to\infty} p_n(T)Vx = V\lim_{n\to\infty} p_n(T)x = Vf(T)x = VSx = SVx. \]
no n-oo 
Step 4 – We apply the result of Step 3 to the bounded operator $V_{m,n,k} := c_m^{-1}P_nT^k\pi_m$, where $P_n := P_{\{|f|\le n\}}$. In view of $\pi_mx = \sum_{n\ge1} c_n\pi_mg_n = c_mg_m$ we have
\begin{align*}
f(T)P_nT^kg_m &= c_m^{-1}f(T)P_nT^k\pi_mx = f(T)V_{m,n,k}x \\
&= SV_{m,n,k}x = c_m^{-1}SP_nT^k\pi_mx = SP_nT^kg_m.
\end{align*}
It follows that $f(T)P_nh = SP_nh$ for all $h$ in the linear span of the sequence $(T^kg_m)_{k\in\mathbb{N}}$, which is dense in $H_m$. Since the spaces $H_m$, $m \ge 1$, span a dense subspace of $H$ this implies that there is a dense subspace $Y$ of $H$ such that $f(T)P_nh = SP_nh$ for all $n \ge 1$ and $h \in Y$. Since $f(T)$ is closed and $f(T)P_n$ is bounded, this identity extends to arbitrary $h \in H$.

Fix an arbitrary $h \in D(f(T))$. Then $P_nh \to h$ and $f(T)P_nh = SP_nh \to Sh$, so the closedness of $f(T)$ implies that $f(T)h = Sh$. Since $S$ is bounded and $f(T)$ is closed, this forces $D(f(T)) = H$ and $f(T) = S$.


Problems

10.1 Prove Propositions 10.26–10.29.

10.2 Let $A$ be a densely defined operator in a Banach space $X$ which is bounded with respect to the norm of $X$, that is, there is a constant $C \ge 0$ such that $\|Ax\| \le C\|x\|$ for all $x \in D(A)$. Prove that $A$ is closable, $D(\overline{A}) = X$, and $\overline{A}$ is bounded with $\|\overline{A}x\| \le C\|x\|$ for all $x \in X$.

10.3 Let $A$ be a densely defined closed operator in a Banach space $X$ and suppose there is a subspace $Y$, contained in $D(A)$ and dense in $X$, such that $Ay = 0$ for all $y \in Y$. Does it follow that $Ax = 0$ for all $x \in D(A)$? What happens if the closedness assumption is dropped?

10.4 Show that if $A$ and $B$ are linear operators in a complex Hilbert space such that $D(A) = D(B)$ and
\[ (Ax|x) = (Bx|x), \quad x \in D(A) = D(B), \]
then $A = B$.

10.5 Define the linear operator $A$ in $L^2(0,1)$ by
\[ D(A) := C[0,1], \]
\[ Af := f(0)\mathbf{1}, \quad f \in D(A). \]
Show that $A$ is densely defined but nonclosable.

10.6 Let $A$ be any nonclosable operator in a Hilbert space $H$. Show that the operator $B$ on the Hilbert space direct sum $H \oplus H$ defined by
\[ D(B) := D(A) \oplus \{0\}, \]
\[ B(x,0) := (0,Ax), \quad (x,0) \in D(B), \]
is symmetric and nonclosable. This example shows that the assumption of dense definition cannot be omitted from Proposition 10.34.

10.7 Provide the details to Example 10.7.

10.8 Let $M$ be an arbitrary nonempty closed subset of $\mathbb{C}$. In the Banach space $C_b(M)$ of bounded continuous functions on $M$ consider the linear operator $A$ given by
\[ D(A) := \{f \in C_b(M) : z \mapsto zf(z) \in C_b(M)\}, \]
\[ Af(z) := zf(z), \quad f \in D(A), \ z \in M. \]
(a) Show that $A$ is a closed operator.
(b) Show that $\sigma(A) = M$.
10.9 In $C[0,1]$ consider the linear operators $A_1f := f'$ and $A_2f := f'$ with domains $D(A_1) := C^\infty[0,1]$ and $D(A_2) := C_c^\infty(0,1)$.
(a) Show that $A_1$ is closable and find the domain of its closure.
(b) Show that $A_2$ is closable and find the domain of its closure.

10.10 Let $A$ be a densely defined closed linear operator from a Banach space $X$ to a Banach space $Y$. Prove or disprove:
(a) for all $T \in \mathscr{L}(X)$, the operator $AT$ with domain $D(AT) = \{x \in X : Tx \in D(A)\}$ is closed;
(b) for all $T \in \mathscr{L}(Y)$ the operator $TA$ with domain $D(TA) = D(A)$ is closed.

10.11 Let $A$ be a densely defined closed linear operator from a Banach space $X$ to a Banach space $Y$. Prove or disprove:
(a) for all $T \in \mathscr{L}(X)$ we have $D((AT)^*) = D(T^*A^*)$;
(b) for all $T \in \mathscr{L}(Y)$ we have $D((TA)^*) = D(A^*T^*)$.

10.12 Give a direct proof of Proposition 10.42.

10.13 Give a proof of Proposition 10.24.
10.14 Let $(\Omega,\mathscr{F},\mu)$ be a measure space, $X$ be a Banach space, and suppose that $f : \Omega \to X$ is Bochner integrable with respect to $\mu$.

(a) Prove Hille's theorem: If $A$ is a closed linear operator in a Banach space $X$, $f$ takes its values in $D(A)$ $\mu$-almost everywhere and the $\mu$-almost everywhere defined function $Af : \Omega \to X$ is $\mu$-Bochner integrable, then $\int_\Omega f \, d\mu \in D(A)$ and
\[ A\int_\Omega f \, d\mu = \int_\Omega Af \, d\mu. \]
Hint: Show that $\omega \mapsto (f(\omega), Af(\omega))$ is Bochner integrable as a function with values in $X \times X$ and hence, by the result of Problem 1.26, as a function with values in the graph $G(A)$.

(b) Justify the identity
\[ \frac{d}{dt}\int_0^1 f(t,s) \, ds = \int_0^1 \frac{\partial}{\partial t}f(t,s) \, ds \]
by providing conditions on $f$ so that the result of part (a) can be applied.

10.15 Show that the results of Problems 9.3 and 9.4 extend to unbounded selfadjoint operators.

10.16 Prove the claims in Examples 10.60 and 10.61.

10.17 Combining Examples 10.39 and 10.61, find the projection-valued measure of the Laplace operator $\Delta$ on $L^2(\mathbb{R}^d)$, viewed as a selfadjoint operator in this space with domain $D(\Delta) = H^2(\mathbb{R}^d)$.

10.18 Let $A$ be a normal operator with projection-valued measure $P$, and let $B \subseteq \sigma(A)$ be a bounded Borel subset. Show that $P_Bx \in D(A)$ for all $x \in H$.

10.19 Let $A$ be a selfadjoint operator with projection-valued measure $P$. Show that for all $\mu \in \mathbb{C} \setminus \mathbb{R}$ we have the following formula for the resolvent of $A$:
\[ R(\mu,A) = \int_{\sigma(A)} \frac{1}{\mu - \lambda} \, dP(\lambda). \]

10.20 Let $A$ be a normal operator with projection-valued measure $P$, and let $f, g \in B_b(\sigma(A))$. Show that if $f = g$ $P$-almost everywhere (in the sense that there is a Borel set $N$ such that $P_N = 0$ and $f = g$ on $\complement N$), then $f(A) = g(A)$.

10.21 Let $A$ be a normal operator.
(a) Show that $A$ is bounded if and only if $\sigma(A)$ is bounded.
(b) Find necessary and sufficient conditions on a given Borel function $f$ on $\sigma(A)$ in order that $f(A)$ be bounded.

10.22 Let $A$ be a normal operator and let $f$ be a Borel function on $\sigma(A)$. Show that $f(A)$ is injective with dense range if and only if $f \ne 0$ $P$-almost everywhere, where $P$ is the projection-valued measure of $A$, and that in this case we have
\[ (f(A))^{-1} = (1/f)(A). \]
Hint: Explain how $(1/f)(A)$ can be defined through the measurable functional calculus. Then use Theorem 10.48 to check that $D((1/f)(A)f(A)) = D(f(A))$. Conclude that $(f(A))^{-1} \subseteq (1/f)(A)$. To get the reverse inclusion apply the preceding to $1/f$ instead of $f$.


11 


Boundary Value Problems 


Having developed some of the core results of Functional Analysis, we now turn to 
applications to partial differential equations. This chapter is concerned with boundary 
value problems. 


11.1 Sobolev Spaces 


We begin by developing some elements of the theory of Sobolev spaces. Our aims are 
relatively modest, in that we only discuss those aspects of the theory that are needed for 
the purposes of the present chapter. 

Throughout this chapter we assume that $d \ge 1$ is an integer and $D$ is a nonempty open subset of $\mathbb{R}^d$.


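Multi-Index Notation A $d$-tuple $\alpha = (\alpha_1,\dots,\alpha_d) \in \mathbb{N}^d$ is called a multi-index of dimension $d$. Its order is the nonnegative integer
\[ |\alpha| := \alpha_1 + \dots + \alpha_d. \]
We also define
\[ \alpha! := \alpha_1! \cdots \alpha_d!. \]
We write $\beta \le \alpha$ if $\beta_j \le \alpha_j$ for all $j = 1,\dots,d$, and in such cases we define $\alpha - \beta := (\alpha_1 - \beta_1,\dots,\alpha_d - \beta_d)$ and
\[ \binom{\alpha}{\beta} := \frac{\alpha!}{\beta!\,(\alpha-\beta)!}. \]
For $x \in \mathbb{R}^d$ and $\alpha \in \mathbb{N}^d$ we write
\[ x^\alpha := x_1^{\alpha_1} \cdots x_d^{\alpha_d}. \]
Similarly we define
\[ \partial^\alpha := \partial_1^{\alpha_1} \circ \dots \circ \partial_d^{\alpha_d}, \]
where $\partial_j$ is the partial derivative in the $j$th direction. By a standard Calculus result, for $C^\infty$-functions the order in which the derivatives are taken is unimportant.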
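The multi-index conventions translate directly into code. The sketch below is an illustration in which the helper names are hypothetical; for the binomial coefficient it assumes the usual convention $\binom{\alpha}{\beta} = \alpha!/(\beta!(\alpha-\beta)!)$ for $\beta \le \alpha$ componentwise.

```python
from math import factorial, prod

def order(alpha):
    """|alpha| = alpha_1 + ... + alpha_d."""
    return sum(alpha)

def multi_factorial(alpha):
    """alpha! = alpha_1! * ... * alpha_d!."""
    return prod(factorial(a) for a in alpha)

def power(x, alpha):
    """x^alpha = x_1^alpha_1 * ... * x_d^alpha_d."""
    return prod(xj ** aj for xj, aj in zip(x, alpha))

def binomial(alpha, beta):
    """alpha! / (beta! (alpha - beta)!), defined when beta <= alpha componentwise."""
    if any(b > a for a, b in zip(alpha, beta)):
        raise ValueError("binomial requires beta <= alpha componentwise")
    diff = tuple(a - b for a, b in zip(alpha, beta))
    return multi_factorial(alpha) // (multi_factorial(beta) * multi_factorial(diff))

alpha = (2, 0, 3)
beta = (1, 0, 2)
x = (2, 5, -1)
```

With these sample values, `order(alpha)` is 5, `multi_factorial(alpha)` is 12, `power(x, alpha)` is -4, and `binomial(alpha, beta)` is 6, which agrees with the product of the ordinary binomial coefficients $\binom21\binom00\binom32$.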


Test Functions By $C^\infty(D)$ we denote the space of functions $f : D \to \mathbb{K}$ having continuous derivatives $\partial^\alpha f$ on $D$ of all orders $\alpha \in \mathbb{N}^d$ and by $C_c^\infty(D)$ its subspace consisting of all functions compactly supported in $D$; recall that this means that the closure of the set $\{x \in D : f(x) \ne 0\}$ is a compact set contained in $D$. Elements of $C_c^\infty(D)$ are referred to as test functions on $D$. The existence of test functions with various additional properties is established in Problem 2.9.

In the same way one defines the space $C^k(D)$, $k \in \mathbb{N}$, as the space of functions $f : D \to \mathbb{K}$ having continuous derivatives $\partial^\alpha f$ on $D$ of all orders $\alpha \in \mathbb{N}^d$ satisfying $|\alpha| \le k$ (with the convention that $C^0(D) = C(D)$), and $C_c^k(D)$ as its subspace of compactly supported functions.

By $C^\infty(\overline{D})$ we denote the space of all functions in $C^\infty(D)$ with the property that $\partial^\alpha f$ has a continuous extension to $\overline{D}$ for all $\alpha \in \mathbb{N}^d$. The spaces $C^k(\overline{D})$ are defined similarly, by considering only the multi-indices satisfying $|\alpha| \le k$.

A measurable function $f : D \to \mathbb{K}$ is called locally integrable if its restriction to every open set $U$ with compact closure contained in $D$ is integrable. The space of all locally integrable functions $f : D \to \mathbb{K}$ is denoted by $L^1_{\rm loc}(D)$; as always we identify functions that are equal almost everywhere. In our study of weak derivatives we need the following result on convolutions.


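Proposition 11.1. Let $k$ be a nonnegative integer. If $f \in L^1_{\rm loc}(\mathbb{R}^d)$ and $g \in C_c^k(\mathbb{R}^d)$, then the convolution $f * g$ is pointwise well defined and belongs to $C^k(\mathbb{R}^d)$, and we have
\[ \partial^\alpha(f * g) = f * (\partial^\alpha g) \]
for all multi-indices $\alpha \in \mathbb{N}^d$ satisfying $|\alpha| \le k$.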


This proposition may be proved in exactly the same way as Lemma 4.59, but it is 
instructive to give a proof by mollification here. 


Proof First note that the convolution integrals defining f * g and f « (0%g) are point- 
wise well defined as Lebesgue integrals. 

Step 1 — We begin with the case k = 0. Let f € L}.(IR“) and g € C,(R“) be given. 
Choose r > 0 such that the support of g is contained in the ball B(0;r). By uniform 
continuity, given € > 0 there exists 6 > 0 such that for all u,u’ € R@ with |u—u'| < 6 
we have |g(u) — g(u’)| < €. Hence, for all x,x’ € R¢ with |x—x'| < 6, 

(f+8)@)— (Fea) < [ LFO)lee—y) —8’ —y)ley 


Rd 
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. lates [fO)Ila@—y) — al’ —y)| dy 


<ef plore 
B(xyr+26) 


noting that g(x — y) = g(x’ —y) =0 for y ¢ B(x;r +26). This proves the continuity of 
f *g at the point x € R4. 

Step 2 — Next consider the case k = 1. Fix 1 < j < d and let e; € IR@ denote the 
unit vector along the jth coordinate axis. Let r > 0 be such that the support of g is 
contained in the ball B(0;r). Given € > 0, choose 6 > 0 such that |u —u'| < 6 implies 
|djg(u) — Ajg(u’)| < €. 1f0 <h< 6, then for all y € R@ we obtain 


h 
(eo + he) a(y)) dja(v)| = Fal Sisto siya 
: 1 [aig +e) ayet)la <e. 


Taking the supremum over y, this shows that 


him | (a +he)) ~eC)) ~BeC]]_=° 


As a consequence, for all x € IR¢ we have 


lim + ((f#g)(x-+ hej) ~ (f*8)(2)) 
= lim ¢ i Ff (y)g(x + hej —y) — g(x—y)) dy 
. 1 
= lim; if onaayf®—Y(8O+ hej) — (0) ay 


7 bese Flx—y)dja(y) dy = hs f(x—y)djg(y) dy, 


where the penultimate step is justified by the uniform convergence of the difference quo- 
tient and the fact that f is integrable on bounded sets. This proves the differentiability 
of f *g in the jth direction, with 0;(f * g) = f * (0jg), and the derivative is continuous 
by Step | applied to the function djg € C.(R?). 


Step 3 – The result for $k \ge 2$ follows by repeating the argument of Step 2 inductively.


The following version of Theorem C.11 will be useful.

Proposition 11.2 (Smooth partition of unity). Let

\[ F \subseteq U_1 \cup \dots \cup U_k, \]

where $F \subseteq \mathbb{R}^d$ is compact and the sets $U_j \subseteq \mathbb{R}^d$ are open for all $j = 1, \dots, k$. Then there exist nonnegative functions $f_j \in C_c^\infty(U_j)$, $j = 1, \dots, k$, such that

\[ f_1 + \dots + f_k \equiv 1 \quad \text{on } F. \]

Here, we think of the functions $f_j$ as elements of $C_c^\infty(\mathbb{R}^d)$ with support in $U_j$.


Proof  Taking intersections with an open ball containing $F$, there is no loss of generality in assuming that the sets $U_j$ are bounded. Since $F$ is compact and the sets $U_j$ are open, there exists a $\delta > 0$ such that $F_\delta \subseteq U_1^\delta \cup \dots \cup U_k^\delta$, where $F_\delta := \{x \in \mathbb{R}^d : d(x,F) \le \delta\}$ and $U_j^\delta := \{x \in U_j : d(x, \complement U_j) > \delta\}$. Theorem C.11 provides us with nonnegative continuous functions $g_j : \mathbb{R}^d \to [0,1]$ supported in $U_j^\delta$ such that

\[ g_1 + \dots + g_k \equiv 1 \quad \text{on } F_\delta. \]

Choose a nonnegative test function $\phi \in C_c^\infty(\mathbb{R}^d)$ with compact support in the open ball $B(0;\delta)$ and satisfying $\int_{\mathbb{R}^d} \phi\,dx = 1$. The functions $f_j := g_j * \phi$ are smooth by Proposition 11.1 and have the desired properties.


11.1.a Weak Derivatives 


In order to make the body of theorems in Functional Analysis applicable to the theory of partial differential equations it is desirable to be able to discuss derivatives of functions in $L^p(D)$. The difficulty is that for such functions, the classical pointwise definition of differentiability through limits of difference quotients does not make sense since their values are well defined only almost everywhere. This necessitates an approach that is insensitive to redefining functions on sets of measure zero. Such an approach is provided by the notion of a weak derivative. With the help of weak derivatives we then introduce the class of Sobolev spaces, which provides the $L^p$-analogues of the classical spaces of continuously differentiable functions.

If $f \in C^k(D)$, then for all test functions $\phi \in C_c^\infty(D)$ and multi-indices $\alpha \in \mathbb{N}^d$ with $|\alpha| \le k$ we have the integration by parts formula

\[ \int_D f(x)\,\partial^\alpha \phi(x)\,dx = (-1)^{|\alpha|} \int_D g(x)\,\phi(x)\,dx, \tag{11.1} \]

where $g = \partial^\alpha f$. Using a smooth partition of unity (Proposition 11.2), the proof of this identity can be reduced to the situation where the support of $\phi$ is contained in an open rectangle contained within $D$; for such $\phi$, the formula follows by separation of variables and integration by parts on intervals in dimension one.

This motivates the following definition. 
Definition 11.3 (Weak derivatives). Let $f \in L^1_{\mathrm{loc}}(D)$. A function $g \in L^1_{\mathrm{loc}}(D)$ is said to be a weak derivative of order $\alpha \in \mathbb{N}^d$ of $f$ if for all $\phi \in C_c^\infty(D)$ we have

\[ \int_D f(x)\,\partial^\alpha \phi(x)\,dx = (-1)^{|\alpha|} \int_D g(x)\,\phi(x)\,dx. \tag{11.2} \]

A function $f \in L^1_{\mathrm{loc}}(D)$ is said to be weakly differentiable of order $k$ if it has weak derivatives $\partial^\alpha f \in L^1_{\mathrm{loc}}(D)$ for all multi-indices satisfying $|\alpha| \le k$.


Remark 11.4. The definition of a weak derivative of order $\alpha$ can equivalently be stated by using functions $\phi \in C_c^k(D)$ for any integer $k \ge |\alpha|$. To see this, suppose that $g \in L^1_{\mathrm{loc}}(D)$ is a weak derivative of order $\alpha$ for the function $f \in L^1_{\mathrm{loc}}(D)$. We wish to prove that the integration by parts formula (11.1) holds for functions $\phi \in C_c^k(D)$. To this end we claim that there exist functions $\phi_n \in C_c^\infty(D)$ such that $\phi_n \to \phi$ and $\partial^\alpha \phi_n \to \partial^\alpha \phi$ uniformly. Once this has been shown, (11.2) follows from

\[ \int_D f(x)\,\partial^\alpha \phi(x)\,dx = \lim_{n \to \infty} \int_D f(x)\,\partial^\alpha \phi_n(x)\,dx = (-1)^{|\alpha|} \lim_{n \to \infty} \int_D g(x)\,\phi_n(x)\,dx = (-1)^{|\alpha|} \int_D g(x)\,\phi(x)\,dx. \]

To prove the claim, let $\eta \in C_c^\infty(\mathbb{R}^d)$ be supported in the unit ball $B(0;1)$ of $\mathbb{R}^d$ and satisfy $\int_{\mathbb{R}^d} \eta\,dx = 1$. For $n \ge 1$ let $\eta^{(n)}(x) := n^d \eta(nx)$. We extend $\phi$ identically zero outside $D$ and define, for $y \in D$,

\[ \phi_n(y) := \eta^{(n)} * \phi(y) = \int_{\mathbb{R}^d} \eta^{(n)}(y - x)\,\phi(x)\,dx. \]

Since $\eta^{(n)}$ is supported in $B(0;\frac{1}{n})$, for sufficiently large $n$ the functions $\phi_n$ are compactly supported in $D$. They are also smooth and hence belong to $C_c^\infty(D)$ by Proposition 11.1, and the desired convergence properties follow by elementary calculus arguments.
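The mollification step in the claim can be explored numerically. The following Python sketch works in dimension $d = 1$ under assumptions that are not part of the text: the bump $\exp(-1/(1-u^2))$ plays the role of $\eta$ (normalised discretely rather than exactly), the sample function $\phi(x) = (1-x^2)^2$ on $[-1,1]$ is a $C^1$ function with compact support, and the helper names `bump`, `phi` and `mollify` are hypothetical. It estimates $\sup_y |\eta^{(n)} * \phi(y) - \phi(y)|$ on a grid for $n = 2, 4, 8$.

```python
import math

def bump(u):
    """Unnormalised smooth bump supported in (-1, 1); stands in for eta."""
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1.0 else 0.0

def phi(x):
    """A C^1 function with compact support in [-1, 1] (not C^2 at x = +-1)."""
    return (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

def mollify(func, n, y, m=400):
    """Approximate (eta^(n) * func)(y), where eta^(n)(x) = n * eta(n x).

    Substituting u = n z turns the convolution into a weighted average of
    func(y - u / n) over u in (-1, 1), evaluated here by a midpoint rule.
    """
    h = 2.0 / m
    nodes = [-1.0 + (i + 0.5) * h for i in range(m)]
    weights = [bump(u) for u in nodes]
    total = sum(weights)  # discrete normalisation: the weights average to 1
    return sum(w * func(y - u / n) for w, u in zip(weights, nodes)) / total

grid = [-1.5 + 0.05 * i for i in range(61)]  # sample points in [-1.5, 1.5]
sup_errors = [max(abs(mollify(phi, n, y) - phi(y)) for y in grid)
              for n in (2, 4, 8)]
print(sup_errors)
```

The printed errors should shrink as $n$ grows, in line with the uniform convergence $\phi_n \to \phi$; the sketch does not examine the derivatives $\partial^\alpha \phi_n$.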


The following proposition implies that weak derivatives, if they exist, are necessarily uniquely defined up to a null set. This allows us to speak of the weak derivative of order $\alpha$ of a function $f$, and denote it by $\partial^\alpha f$. The proposition could be proved along the lines of Lemma 4.59, but it will be instructive to present a proof based on mollification.

In what follows we write

\[ U \Subset D \]

to express that the closure of $U$ is compact and contained in $D$.


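Proposition 11.5. If a function $g \in L^1_{\mathrm{loc}}(D)$ satisfies

\[ \int_D g\,\phi\,dx = 0 \]

for all $\phi \in C_c^\infty(D)$, then $g = 0$ almost everywhere on $D$.

Proof  Let $B$ be any open ball such that $B \Subset D$ and let $\psi \in C_c^\infty(D)$ satisfy $\psi \equiv 1$ on $B$. Pick a mollifier $\eta \in C_c^\infty(\mathbb{R}^d)$ satisfying $\int_{\mathbb{R}^d} \eta\,dx = 1$. For $y \in \mathbb{R}^d$ set $\eta_{\varepsilon,y}(x) := \eta_\varepsilon(y - x)$, where $\eta_\varepsilon(x) = \varepsilon^{-d}\eta(\varepsilon^{-1}x)$ for $x \in \mathbb{R}^d$. Extending $\psi$ and $g$ identically zero outside $D$ and noting that $\eta_{\varepsilon,y}\psi \in C_c^\infty(D)$, the assumption implies that for all $y \in \mathbb{R}^d$ we have

\[ \eta_\varepsilon * (\psi g)(y) = \int_{\mathbb{R}^d} \eta_\varepsilon(y - x)\,\psi(x)\,g(x)\,dx = \int_D \eta_{\varepsilon,y}(x)\,\psi(x)\,g(x)\,dx = 0. \]

By Proposition 2.34 we have $\eta_\varepsilon * (\psi g) \to \psi g$ in $L^1(\mathbb{R}^d)$ as $\varepsilon \downarrow 0$. Passing to an almost everywhere convergent subsequence, it follows that $\psi g = 0$ almost everywhere on $\mathbb{R}^d$ and therefore $g = 0$ almost everywhere on $B$. Since this is true for every open ball $B \Subset D$, the result follows.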


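The following simple observation will be used repeatedly without further comment.

Lemma 11.6. If $f \in L^1_{\mathrm{loc}}(D)$ and $\alpha \in \mathbb{N}^d$ is a multi-index, then:

(1) if $f$ has a weak derivative of order $\alpha$ and $D'$ is a nonempty open subset of $D$, then $f|_{D'}$ has a weak derivative of order $\alpha$ given by

\[ \partial^\alpha (f|_{D'}) = (\partial^\alpha f)|_{D'}; \]

(2) if $f$ has a weak derivative $g$ of order $\alpha$ and $g$ has weak derivative $h$ of order $\beta$, then $f$ has a weak derivative of order $\alpha + \beta$ given by $h$, that is,

\[ \partial^\beta (\partial^\alpha f) = \partial^{\alpha+\beta} f. \]

Proof  (1): We consider only test functions $\phi \in C_c^\infty(D')$ in (11.2). Extending them identically $0$ to test functions defined on all of $D$, we obtain, with $g = \partial^\alpha f$,

\[ \int_{D'} f(x)\,\partial^\alpha \phi(x)\,dx = \int_D f(x)\,\partial^\alpha \phi(x)\,dx = (-1)^{|\alpha|} \int_D g(x)\,\phi(x)\,dx = (-1)^{|\alpha|} \int_{D'} g(x)\,\phi(x)\,dx. \]

(2): For all $\phi, \psi \in C_c^\infty(D)$ we have

\[ \int_D f(x)\,\partial^\alpha \phi(x)\,dx = (-1)^{|\alpha|} \int_D g(x)\,\phi(x)\,dx \]

and

\[ \int_D g(x)\,\partial^\beta \psi(x)\,dx = (-1)^{|\beta|} \int_D h(x)\,\psi(x)\,dx, \]

and the result follows by applying the first identity with $\phi = \partial^\beta \psi$.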

Example 11.7. The classical integration by parts formula (11.1) says that functions in $C^k(D)$ are weakly differentiable of order $k$, with weak derivatives given by their classical derivatives.

Example 11.8. The function $f(x) = |x|$ has a weak derivative on $\mathbb{R}$, given by

\[ \operatorname{sign}(x) = \begin{cases} 1, & x > 0, \\ -1, & x < 0. \end{cases} \]

This follows from

\[ \int_{-\infty}^\infty |x|\,\phi'(x)\,dx = \int_0^\infty x\,\phi'(x)\,dx - \int_{-\infty}^0 x\,\phi'(x)\,dx = -\int_0^\infty \phi(x)\,dx + \int_{-\infty}^0 \phi(x)\,dx = -\int_{-\infty}^\infty \operatorname{sign}(x)\,\phi(x)\,dx. \]

A far-reaching generalisation of this example is given in Theorem 11.23.
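The identity of Example 11.8 can be checked numerically for one particular test function. The sketch below is only an illustration under assumptions of its own: the test function is a bump supported in $(-0.7, 1.3)$ (chosen asymmetric so that neither side vanishes by symmetry), the integrals are approximated by a midpoint rule, and the names `phi`, `phi_prime`, `sign` and `integrate` are hypothetical.

```python
import math

CENTER, RADIUS = 0.3, 1.0  # assumed support: (CENTER - RADIUS, CENTER + RADIUS)

def phi(x):
    """Smooth compactly supported test function (a shifted bump)."""
    u = (x - CENTER) / RADIUS
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1.0 else 0.0

def phi_prime(x):
    """Classical derivative of phi, obtained from the chain rule."""
    u = (x - CENTER) / RADIUS
    if abs(u) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * u / (1.0 - u * u) ** 2) / RADIUS

def sign(x):
    return 1.0 if x > 0 else -1.0

def integrate(func, a, b, n=40_000):
    """Composite midpoint rule; x = 0 is a cell boundary when a = -b."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# Left-hand side: integral of |x| phi'(x); right-hand side: -integral of sign(x) phi(x).
lhs = integrate(lambda x: abs(x) * phi_prime(x), -2.0, 2.0)
rhs = -integrate(lambda x: sign(x) * phi(x), -2.0, 2.0)
print(lhs, rhs)
```

The two printed numbers should agree up to quadrature error, which is what the defining identity (11.2) with $\alpha = 1$ requires of the pair $f(x) = |x|$, $g = \operatorname{sign}$.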


Example 11.9. We claim that the function $f(x) = \operatorname{sign}(x)$ has no weak derivative on $\mathbb{R}$. Suppose, for a contradiction, that $g \in L^1_{\mathrm{loc}}(\mathbb{R})$ is a weak derivative of $f$. The restrictions of $g$ to $\mathbb{R}_+$ and $\mathbb{R}_-$ are weak derivatives of the corresponding restrictions of $f$. But these restrictions, being constant functions, have classical derivatives both equal to $0$. Since classical derivatives are weak derivatives, it follows that $g = 0$ almost everywhere on both $\mathbb{R}_+$ and $\mathbb{R}_-$, and therefore $g = 0$ almost everywhere on $\mathbb{R}$. We then arrive at the contradiction that, for all test functions $\phi \in C_c^\infty(\mathbb{R})$,

\[ 0 = -\int_{-\infty}^\infty g(x)\,\phi(x)\,dx = \int_{-\infty}^\infty \operatorname{sign}(x)\,\phi'(x)\,dx = \int_0^\infty \phi'(x)\,dx - \int_{-\infty}^0 \phi'(x)\,dx = (0 - \phi(0)) - (\phi(0) - 0) = -2\phi(0). \]

This proves the claim.
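The computation behind Example 11.9 can also be observed numerically: for a test function with $\phi(0) \neq 0$ the integral of $\operatorname{sign}(x)\,\phi'(x)$ equals $-2\phi(0)$, which no locally integrable $g$ vanishing off the origin can reproduce. The following sketch is an illustration only; the bump supported in $(-0.7, 1.3)$, the midpoint rule, and the names `phi`, `phi_prime` and `integrate` are assumptions made here.

```python
import math

CENTER, RADIUS = 0.3, 1.0  # assumed support: (CENTER - RADIUS, CENTER + RADIUS)

def phi(x):
    """Smooth compactly supported test function with phi(0) != 0."""
    u = (x - CENTER) / RADIUS
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1.0 else 0.0

def phi_prime(x):
    """Classical derivative of phi, obtained from the chain rule."""
    u = (x - CENTER) / RADIUS
    if abs(u) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * u / (1.0 - u * u) ** 2) / RADIUS

def integrate(func, a, b, n=40_000):
    """Composite midpoint rule; x = 0 is a cell boundary when a = -b."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# Integral of sign(x) * phi'(x) over the real line (phi vanishes outside [-2, 2]).
value = integrate(lambda x: (1.0 if x > 0 else -1.0) * phi_prime(x), -2.0, 2.0)
print(value, -2.0 * phi(0.0))
```

Both printed numbers should coincide up to quadrature error and be different from zero, matching the contradiction derived in the example.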


We have the following version of Proposition 11.1. In its statement, we think of $f$ as being defined on $\mathbb{R}^d$ by zero extension.

Proposition 11.10. Let $f \in L^1_{\mathrm{loc}}(D)$ have a weak derivative of order $\alpha$ on $D$. Suppose that $\eta \in C_c^\infty(\mathbb{R}^d)$ has support in $B(0;r)$ for some $r > 0$, and let the open set $U \Subset D$ satisfy $d(U, \partial D) > r$. Then the function $\eta * f$ has weak and classical derivatives of order $\alpha$ on $U$, and both are given by

\[ \partial^\alpha (\eta * f) = (\partial^\alpha \eta) * f = \eta * (\partial^\alpha f). \tag{11.3} \]

Proof  Proposition 11.1 shows that $\eta * f \in C^\infty(\mathbb{R}^d)$ and $\partial^\alpha(\eta * f) = (\partial^\alpha \eta) * f$ with all derivatives classical.

For all $\phi \in C_c^\infty(U)$ we have, using Fubini's theorem twice,

\[ \begin{aligned} \int_U (\eta * f)(x)\,\partial^\alpha \phi(x)\,dx &= \int_{\mathbb{R}^d} \Bigl( \int_{\mathbb{R}^d} \eta(y)\,f(x - y)\,dy \Bigr)\,\partial^\alpha \phi(x)\,dx \\ &= \int_{B(0;r)} \eta(y) \Bigl( \int_U f(x - y)\,\partial_x^\alpha \phi(x)\,dx \Bigr)\,dy \\ &\overset{(*)}{=} (-1)^{|\alpha|} \int_{B(0;r)} \eta(y) \Bigl( \int_U \partial_x^\alpha f(x - y)\,\phi(x)\,dx \Bigr)\,dy \\ &= (-1)^{|\alpha|} \int_U \phi(x) \Bigl( \int_{\mathbb{R}^d} \eta(y)\,\partial^\alpha f(x - y)\,dy \Bigr)\,dx. \end{aligned} \]

Here we used the subscript $x$ to express the variable with respect to which the derivatives are taken. The identity $(*)$ is justified by the assumptions that $\phi$ is supported in $U$, $\eta$ is supported in $B(0;r)$, and $d(U, \partial D) > r$, and therefore $\phi(\cdot + y) \in C_c^\infty(D)$ for all $y \in B(0;r)$. This proves that $\eta * f$ has a weak derivative of order $\alpha$ on $U$ given by (11.3).


In the proof of the next proposition we will use the fact that for all $1 \le p \le \infty$, the operator

\[ f \mapsto \partial^\alpha f \]

is closed as a linear operator in $L^p(D)$ with domain

\[ \mathsf{D}(\partial^\alpha) := \{ f \in L^p(D) : f \text{ has a weak derivative of order } \alpha \text{ in } L^p(D) \}. \]

This domain of course depends on $p$, but we suppress this from the notation. To prove that $\partial^\alpha$ is a closed operator, suppose that $f_n \to f$ in $L^p(D)$, with $f_n \in \mathsf{D}(\partial^\alpha)$ for all $n$, and $\partial^\alpha f_n \to g$ in $L^p(D)$. We must prove that $f \in \mathsf{D}(\partial^\alpha)$ and $\partial^\alpha f = g$. For all $\phi \in C_c^\infty(D)$ we have

\[ \int_D f_n\,\partial^\alpha \phi\,dx = (-1)^{|\alpha|} \int_D (\partial^\alpha f_n)\,\phi\,dx. \]

Passing to the limit $n \to \infty$ in this formula (which is possible by Hölder's inequality, thanks to the fact that test functions belong to $L^q(D)$ with $\frac{1}{p} + \frac{1}{q} = 1$) we obtain

\[ \int_D f\,\partial^\alpha \phi\,dx = (-1)^{|\alpha|} \int_D g\,\phi\,dx. \]

This means that the function $g \in L^p(D)$ is a weak derivative of $f$ of order $\alpha$.

By $L^p_{\mathrm{loc}}(D)$ we denote the space of measurable functions whose restrictions to all sets $U \Subset D$ belong to $L^p(U)$, identifying functions that are equal almost everywhere on $D$.
Proposition 11.11. Let $1 \le p < \infty$. A function $f \in L^p_{\mathrm{loc}}(D)$ admits a weak derivative of order $\alpha$ if and only if there exist a sequence of functions $f_n \in C_c^\infty(D)$ and a function $g \in L^p_{\mathrm{loc}}(D)$ such that for all open sets $U \Subset D$ we have:

(i) $f_n \to f$ in $L^p(U)$;
(ii) $\partial^\alpha f_n \to g$ in $L^p(U)$.

In this situation we have $\partial^\alpha f = g$.


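Proof  'If': Let $U \Subset D$. Since $\partial^\alpha$ is closed as an operator in $L^p(U)$, (i) and (ii) imply that $f|_U \in \mathsf{D}(\partial^\alpha)$ and $\partial^\alpha (f|_U) = g|_U$. If $U_1 \Subset D$ and $U_2 \Subset D$ are open sets with nonempty intersection, the resulting weak derivatives $g_1 \in L^p(U_1)$ and $g_2 \in L^p(U_2)$ agree on $U_1 \cap U_2$ by Proposition 11.5. Hence by piecing together these weak derivatives we obtain a well-defined function $g \in L^p_{\mathrm{loc}}(D)$. Since every test function is supported in one of the sets $U$ under consideration, $g$ is seen to be a weak derivative of order $\alpha$ for $f$.

'Only if': For $r > 0$ let $D_r = \{x \in D : d(x, \partial D) > r\}$. For every $\varepsilon > 0$, choose $\psi_\varepsilon \in C_c^\infty(D)$ such that $0 \le \psi_\varepsilon \le 1$ pointwise, $\psi_\varepsilon \equiv 1$ on $D_{2\varepsilon}$, and $\psi_\varepsilon \equiv 0$ on $\complement D_\varepsilon$. Let $\eta \in C_c^\infty(\mathbb{R}^d)$ have support in $B(0;1)$ and satisfy $\int_{\mathbb{R}^d} \eta\,dx = 1$, and define $\eta_\varepsilon(x) := \varepsilon^{-d}\eta(\varepsilon^{-1}x)$. For $n \ge 1$ define

\[ f_n := \psi_{1/n} \cdot (f * \eta_{1/n}), \]

where we think of $f$ as a function on $\mathbb{R}^d$ by zero extension. Then $f_n \in C_c^\infty(D)$ and, by Proposition 11.10 and the classical product rule,

\[ \partial^\alpha f_n = \psi_{1/n}\,(\partial^\alpha f * \eta_{1/n}) + \sum_{\substack{0 \le \beta \le \alpha \\ \beta \ne 0}} \binom{\alpha}{\beta} (\partial^\beta \psi_{1/n})\,\partial^{\alpha-\beta}(f * \eta_{1/n}). \]

We will prove that the functions $f_n$ have the required properties, with $g = \partial^\alpha f$. Given an open set $U \Subset D$, let $N \ge 1$ be so large that $U \subseteq D_{2/N}$. For all $n \ge N$ we have $\psi_{1/n} \equiv 1$ on $U$ and

\[ \begin{aligned} \mathbf{1}_U(x)\,(f * \eta_{1/n})(x) &= \mathbf{1}_U(x) \int_{\mathbb{R}^d} f(x - y)\,\eta_{1/n}(y)\,dy \\ &= \mathbf{1}_U(x) \int_{\mathbb{R}^d} \mathbf{1}_{U + B(0;1/N)}(x - y)\,f(x - y)\,\eta_{1/n}(y)\,dy \\ &= \mathbf{1}_U(x) \cdot \bigl( (\mathbf{1}_{U + B(0;1/N)} f) * \eta_{1/n} \bigr)(x), \qquad x \in D, \end{aligned} \]

with $U + B(0;1/N) \Subset D$. Since $\mathbf{1}_{U + B(0;1/N)} f \in L^p(D)$, by Proposition 2.34 we obtain

\[ \mathbf{1}_U f_n = \mathbf{1}_U \psi_{1/n} (f * \eta_{1/n}) = \mathbf{1}_U \cdot \bigl( (\mathbf{1}_{U + B(0;1/N)} f) * \eta_{1/n} \bigr) \to \mathbf{1}_U \mathbf{1}_{U + B(0;1/N)} f = \mathbf{1}_U f \]

as $n \to \infty$, with convergence in $L^p(D)$. Also, for all $n \ge N$ we have $\partial^\alpha(f * \eta_{1/n}) = (\partial^\alpha f) * \eta_{1/n}$ on $U$, as well as $\partial^\beta \psi_{1/n} = 0$ on $U$ for all $0 \le \beta \le \alpha$ with $\beta \ne 0$. Therefore, by the same reasoning,

\[ \mathbf{1}_U\,\partial^\alpha f_n = \mathbf{1}_U \cdot \bigl( (\mathbf{1}_{U + B(0;1/N)}\,\partial^\alpha f) * \eta_{1/n} \bigr) \to \mathbf{1}_U\,\partial^\alpha f \]

as $n \to \infty$, with convergence in $L^p(D)$.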


As an application of this proposition we prove the following result on existence of lower-order weak derivatives.

Theorem 11.12 (Existence of lower order weak derivatives). If a function $f \in L^1_{\mathrm{loc}}(D)$ admits a weak derivative $\partial^\alpha f$ for some $\alpha \in \mathbb{N}^d$, then it admits weak derivatives $\partial^\beta f$ for all $\beta \in \mathbb{N}^d$ satisfying $0 \le \beta \le \alpha$. If both $f$ and $\partial^\alpha f$ belong to $L^p(D)$, then so do the weak derivatives $\partial^\beta f$.

For notational simplicity we give the proof only in dimension $d = 1$; the argument carries over without difficulty to higher dimensions. The crucial step is contained in the next lemma.

Lemma 11.13. For all $k \ge 1$ there is a constant $C_k \ge 0$ such that for all $f \in C_c^k(\mathbb{R})$ and $1 \le p \le \infty$ we have

\[ \|\partial^{k-1} f\|_p \le C_k \bigl( \|f\|_p + \|\partial^k f\|_p \bigr). \]
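Proof  Let $\zeta \in C_c^\infty(\mathbb{R})$ satisfy $\zeta(0) = 1$ and $\zeta'(0) = \zeta''(0) = 0$. Combining the identity

\[ \frac{d}{dt}\bigl( \zeta'(t)\,f(x + te_1) \bigr) = \zeta'(t)\,\partial f(x + te_1) + \zeta''(t)\,f(x + te_1), \]

which follows from $\frac{d}{dt} f(x + te_1) = \partial f(x + te_1)$, with the identity

\[ \frac{d}{dt}\bigl( \zeta(t)\,\partial f(x + te_1) \bigr) = \zeta(t)\,\partial^2 f(x + te_1) + \zeta'(t)\,\partial f(x + te_1), \]

which follows from $\frac{d}{dt} \partial f(x + te_1) = \partial^2 f(x + te_1)$, we arrive at

\[ \frac{d}{dt}\bigl( \zeta(t)\,\partial f(x + te_1) - \zeta'(t)\,f(x + te_1) \bigr) = \zeta(t)\,\partial^2 f(x + te_1) - \zeta''(t)\,f(x + te_1). \]

Upon integrating over $[0,\infty)$ and using that $\zeta \in C_c^\infty(\mathbb{R})$, $\zeta(0) = 1$, and $\zeta'(0) = 0$,

\[ \partial f(x) = -\int_0^\infty \zeta(t)\,\partial^2 f(x + te_1)\,dt + \int_0^\infty \zeta''(t)\,f(x + te_1)\,dt. \]

If $f \in C_c^3(\mathbb{R})$ we can apply this identity with $f$ replaced by $\partial f$. After an integration by parts, using $\zeta''(0) = 0$, we obtain

\[ \begin{aligned} \partial^2 f(x) &= -\int_0^\infty \zeta(t)\,\partial^3 f(x + te_1)\,dt + \int_0^\infty \zeta''(t)\,\partial f(x + te_1)\,dt \\ &= -\int_0^\infty \zeta(t)\,\partial^3 f(x + te_1)\,dt - \int_0^\infty \zeta'''(t)\,f(x + te_1)\,dt. \end{aligned} \]

Continuing inductively (choosing $\zeta$ such that, in addition, $\zeta^{(j)}(0) = 0$ for all $1 \le j \le k-1$, for instance $\zeta \equiv 1$ in a neighbourhood of $0$), for $f \in C_c^k(\mathbb{R})$ with $k \ge 2$ we arrive at the identity

\[ \partial^{k-1} f(x) = -\int_0^\infty \zeta(t)\,\partial^k f(x + te_1)\,dt + (-1)^k \int_0^\infty \zeta^{(k)}(t)\,f(x + te_1)\,dt. \]

Taking $L^p$-norms and moving norms inside the integral using Proposition 1.44, we obtain

\[ \|\partial^{k-1} f\|_p \le \int_0^\infty \bigl( |\zeta^{(k)}(t)|\,\|f(\cdot + te_1)\|_p + |\zeta(t)|\,\|\partial^k f(\cdot + te_1)\|_p \bigr)\,dt \le L \bigl( \|\zeta\|_\infty + \|\zeta^{(k)}\|_\infty \bigr) \bigl( \|f\|_p + \|\partial^k f\|_p \bigr), \]

where $L$ is the length of an interval containing the compact support of $\zeta$.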
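The first integral identity of the proof, $\partial f(x) = -\int_0^\infty \zeta\,\partial^2 f(x + t)\,dt + \int_0^\infty \zeta''\,f(x + t)\,dt$, lends itself to a quick numerical check in dimension one. The sketch below makes assumptions that differ from the lemma and are chosen only for convenience: $\zeta(t) = (1 - t^2)^3$ on $[0,1]$ (extended by zero) is merely $C^2$ rather than smooth, but it satisfies $\zeta(0) = 1$, $\zeta'(0) = 0$ and vanishes to second order at $t = 1$, which is all the identity uses; $f = \sin$ is not compactly supported, which is harmless because $\zeta$ is. The names `zeta`, `zeta_second` and `integrate` are hypothetical.

```python
import math

def zeta(t):
    """Cut-off with zeta(0) = 1, zeta'(0) = 0, vanishing to second order at t = 1."""
    return (1.0 - t * t) ** 3 if 0.0 <= t < 1.0 else 0.0

def zeta_second(t):
    """Second derivative of zeta on [0, 1), zero elsewhere."""
    if not 0.0 <= t < 1.0:
        return 0.0
    return -6.0 * (1.0 - t * t) ** 2 + 24.0 * t * t * (1.0 - t * t)

def f(s):
    return math.sin(s)

def f_second(s):
    return -math.sin(s)

def integrate(func, a, b, n=20_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

x = 0.7
# Right-hand side of the identity; the integrals stop at t = 1, where zeta vanishes.
rhs = (-integrate(lambda t: zeta(t) * f_second(x + t), 0.0, 1.0)
       + integrate(lambda t: zeta_second(t) * f(x + t), 0.0, 1.0))
print(rhs, math.cos(x))
```

The two printed values should agree up to quadrature error, since the left-hand side of the identity is $f'(x) = \cos x$ for this choice of $f$.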


Proof of Theorem 11.12  Again we limit ourselves to dimension $d = 1$ for notational convenience, and set $k := \alpha$. The cases $k = 0, 1$ being trivial, we assume that $k \ge 2$. Let $f \in L^1_{\mathrm{loc}}(D)$ have weak derivative $\partial^k f$. It suffices to prove that $f$ has a weak derivative $\partial^{k-1} f$ in $L^1_{\mathrm{loc}}(D)$, for once we know that, lower-order weak derivatives are obtained by applying this result several times.

Let $(f_n)_{n \ge 1}$ be a sequence in $C_c^\infty(D)$ as stated in Proposition 11.11, that is, for all open sets $U \Subset D$ we have:

(i) $f_n \to f$ in $L^1(U)$;
(ii) $\partial^k f_n \to \partial^k f$ in $L^1(U)$.


By the lemma, for all 1 < p < ~ we have 


a fn 9" finll < Ce(ILyfn = nll + 10" fn — O* finlly)- 


Since the right-hand side tends to $0$ as $m, n \to \infty$, by completeness there exists a function $g_U \in L^1(U)$ such that $\lim_{n \to \infty} \partial^{k-1} f_n = g_U$ in $L^1(U)$. As in the proof of Proposition 11.11 these functions may be glued together to a well-defined function $g \in L^1_{\mathrm{loc}}(D)$, and this function is easily checked to be a weak derivative of order $k - 1$ of $f$.

If $f$ and $\partial^k f$ belong to $L^p(D)$, and $(f_n)_{n \ge 1}$ is a sequence in $C_c^\infty(D)$ such that (i) and (ii) hold with $L^1(U)$ replaced by $L^p(U)$, then by the estimate of the lemma, the functions $g_U$ belong to $L^p(U)$ with norm $\|g_U\|_{L^p(U)} \le C_k(\|f\|_{L^p(U)} + \|\partial^k f\|_{L^p(U)}) \le C_k(\|f\|_{L^p(D)} + \|\partial^k f\|_{L^p(D)})$. This implies that also $\|g\|_{L^p(D)} \le C_k(\|f\|_{L^p(D)} + \|\partial^k f\|_{L^p(D)})$.


As an application we prove the following product rule.

Proposition 11.14 (Product rule). If $f \in L^1_{\mathrm{loc}}(D)$ admits a weak derivative of order $\alpha$, then so does the pointwise product $\psi f$ for every $\psi \in C^\infty(D)$, and we have the Leibniz formula

\[ \partial^\alpha (\psi f) = \sum_{0 \le \beta \le \alpha} \binom{\alpha}{\beta} (\partial^\beta \psi)(\partial^{\alpha-\beta} f). \]

The lower-order weak derivatives $\partial^{\alpha-\beta} f$ exist thanks to Theorem 11.12. In most applications, however, the function $f$ belongs to suitable Sobolev spaces and the existence of all weak derivatives up to a certain order is assumed. The proposition admits variations in terms of the assumptions on $f$ and $\psi$ (see, for instance, Problem 11.12).

Proof  We only need to prove this for multi-indices of order one, since the general case then follows by an induction argument. Let $1 \le j \le d$. Since for all $\phi \in C_c^\infty(D)$ we have $\phi\psi \in C_c^\infty(D)$, the classical product rule for $\psi\phi$ gives

\[ \int_D \bigl( \psi(x)\,\partial_j \phi(x) + \partial_j \psi(x)\,\phi(x) \bigr) f(x)\,dx = -\int_D \psi(x)\,\phi(x)\,\partial_j f(x)\,dx. \]

After rearranging, this says that the weak derivative $\partial_j(\psi f)$ exists and is given by $(\partial_j \psi) f + \psi\,\partial_j f$.


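For the proof of the next proposition we need a simple lemma on extensions.

Lemma 11.15. Let $f \in L^1_{\mathrm{loc}}(D)$ be supported in an open set $U \Subset D$ and let $\widetilde{D}$ be an open set containing $D$. If $f$ admits a weak derivative $\partial^\alpha f$ on $D$, then the zero extension $\widetilde{f}$ of $f$ to $\widetilde{D}$ admits a weak derivative $\partial^\alpha \widetilde{f}$ on $\widetilde{D}$, given by the zero extension $\widetilde{\partial^\alpha f}$ of $\partial^\alpha f$.

Proof  Fix an arbitrary test function $\eta \in C_c^\infty(\widetilde{D})$ with support in $D$ and such that $\eta \equiv 1$ on $U$. For all $\phi \in C_c^\infty(\widetilde{D})$, the (classical) derivatives $\partial^\alpha \phi$ and $\partial^\alpha(\eta\phi)$ agree on $U$, and therefore

\[ \int_{\widetilde{D}} \widetilde{f}\,\partial^\alpha \phi\,dx = \int_D f\,\partial^\alpha(\eta\phi)\,dx = (-1)^{|\alpha|} \int_D (\partial^\alpha f)\,\eta\phi\,dx = (-1)^{|\alpha|} \int_{\widetilde{D}} \widetilde{\partial^\alpha f}\,\phi\,dx, \]

since $\eta\phi \in C_c^\infty(D)$.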


The gradient of a weakly differentiable function $f$ is the function $\nabla f \in L^1_{\mathrm{loc}}(D; \mathbb{K}^d)$ defined by

\[ \nabla f := (\partial_1 f, \dots, \partial_d f). \]

As a linear operator from $L^p(D)$ into $L^p(D; \mathbb{K}^d)$ with domain $W^{1,p}(D)$, the weak gradient is closed; this is proved in the same way as we did for weak derivatives. Here, $L^p(D; \mathbb{K}^d)$ denotes the Banach space of all functions $g : D \to \mathbb{K}^d$ with measurable component functions such that $x \mapsto |g(x)| = (\sum_{j=1}^d |g_j(x)|^2)^{1/2}$ belongs to $L^p(D)$, identifying functions that are equal almost everywhere. The proof that $\|g\|_{L^p(D;\mathbb{K}^d)} := \|\,|g(\cdot)|\,\|_{L^p(D)}$ defines a norm which turns $L^p(D; \mathbb{K}^d)$ into a Banach space is the same as in the scalar-valued case (see Problem 2.27 for the more general case of vector-valued $L^p$-spaces).

An open subset $U$ of $\mathbb{R}^d$ is said to be (pathwise) connected if any two points $x, y \in U$ can be joined by a continuous curve in $U$, that is, there exists a continuous mapping $\gamma : [0,1] \to U$ such that $\gamma(0) = x$ and $\gamma(1) = y$.


Proposition 11.16. Let $f \in L^1_{\mathrm{loc}}(D)$ be weakly differentiable. If $\nabla f = 0$ almost everywhere on an open connected subset $U$ of $D$, then $f$ is almost everywhere equal to a constant on $U$.

Proof  Let $B$ be an open ball whose closure is contained in $U$ and choose a test function $\psi \in C_c^\infty(D)$ such that $\psi \equiv 1$ on $B$. By Proposition 11.14, $\psi f$ is weakly differentiable on $D$ and $\nabla(\psi f) = f \nabla\psi + \psi \nabla f$. In particular, $\nabla(\psi f) = 0$ on $B$.

Extending $f$ and $\psi$ identically zero to all of $\mathbb{R}^d$, by Lemma 11.15 the function $\psi f$ is weakly differentiable as a function on $\mathbb{R}^d$, with a weak gradient in $L^1(\mathbb{R}^d; \mathbb{K}^d)$ that equals the zero extension of the weak gradient of $\psi f$ on $D$. By slight abuse of notation, both will be denoted by $\nabla(\psi f)$.

Pick a mollifier $\eta \in C_c^\infty(\mathbb{R}^d)$ satisfying $\int_{\mathbb{R}^d} \eta\,dx = 1$, and set $\eta_\varepsilon(x) := \varepsilon^{-d}\eta(\varepsilon^{-1}x)$ for $x \in \mathbb{R}^d$. By Proposition 11.10 we have $\eta_\varepsilon * (\psi f) \in C^\infty(\mathbb{R}^d)$ and

\[ \nabla\bigl( \eta_\varepsilon * (\psi f) \bigr) = \eta_\varepsilon * \nabla(\psi f) \quad \text{on } \mathbb{R}^d. \]

In particular, $\nabla(\eta_\varepsilon * (\psi f)) = 0$ on $B$. Since the function $\eta_\varepsilon * (\psi f)$ is $C^\infty$, classical calculus arguments imply that this function is constant on $B$. Taking the $L^1(\mathbb{R}^d)$-limit as $\varepsilon \downarrow 0$ using Proposition 2.34, and passing to an almost everywhere convergent subsequence, it is seen that $\psi f$ is constant almost everywhere on $B$. Indeed, this shows that $\psi f$ is the almost everywhere limit of a sequence of functions each of which is constant on $B$. Since $\psi \equiv 1$ on $B$ it follows that $f$ is constant almost everywhere on $B$. This being true for every open ball $B$ contained in $U$, this gives the result.




11.1.b The Sobolev Spaces $W^{k,p}(D)$

By Hölder's inequality, every $f \in L^p(D)$ with $1 \le p \le \infty$ belongs to $L^1_{\mathrm{loc}}(D)$. This suggests the following definition.

Definition 11.17 (Sobolev spaces). For $k \in \mathbb{N}$ and $1 \le p \le \infty$ the Sobolev space $W^{k,p}(D)$ is the space of all $f \in L^p(D)$ whose weak derivatives of all orders $|\alpha| \le k$ exist and belong to $L^p(D)$.

Endowed with the norm

\[ \|f\|_{W^{k,p}(D)} := \sum_{|\alpha| \le k} \|\partial^\alpha f\|_{L^p(D)}, \]

$W^{k,p}(D)$ is a Banach space. Indeed, if $(f_n)_{n \ge 1}$ is a Cauchy sequence in $W^{k,p}(D)$, then for all $|\alpha| \le k$ the sequence of weak derivatives $(\partial^\alpha f_n)_{n \ge 1}$ is a Cauchy sequence in $L^p(D)$ and hence convergent to some $f^{(\alpha)}$. Set $f := f^{(0)}$. Using Hölder's inequality as in Corollary 2.25, we may pass to the limit $n \to \infty$ in the identity

\[ \int_D f_n(x)\,\partial^\alpha \phi(x)\,dx = (-1)^{|\alpha|} \int_D \partial^\alpha f_n(x)\,\phi(x)\,dx, \qquad \phi \in C_c^\infty(D). \]

It follows that $f$ has weak derivatives of all orders $|\alpha| \le k$ given by $\partial^\alpha f = f^{(\alpha)}$. Having observed this, it is clear that $f \in W^{k,p}(D)$ and $f_n \to f$ in $W^{k,p}(D)$.

For $1 \le p < \infty$, the linear operators $\partial^\alpha$ are densely defined; this follows by noting that functions in $C_c^\infty(D)$ belong to $\mathsf{D}(\partial^\alpha)$ and recalling from Proposition 2.29 that $C_c^\infty(D)$ is dense in $L^p(D)$. This prompts the question which functions in $W^{k,p}(D)$ can be approximated, in the norm of this space, by test functions. A first result in this direction is the following proposition.

Proposition 11.18 (Local approximation by test functions). For all $f \in W^{1,p}(D)$ with $1 \le p < \infty$ there exists a sequence of test functions $f_n \in C_c^\infty(D)$ such that for every open set $U \Subset D$ we have

\[ \lim_{n \to \infty} f_n|_U = f|_U \quad \text{in } W^{1,p}(U). \]

Proof  This follows from Proposition 11.11, noting that the functions $f_n$ constructed in the proof handle the $L^p$-convergence of all weak derivatives simultaneously.


Proposition 11.19 (Chain rule). Let $\rho : \mathbb{K} \to \mathbb{K}$ be a $C^1$-function with bounded derivative and satisfying $\rho(0) = 0$, and let $1 \le p < \infty$. For all $f \in W^{1,p}(D)$ we have $\rho \circ f \in W^{1,p}(D)$ and

\[ \partial_j(\rho \circ f) = (\rho' \circ f)\,\partial_j f, \qquad j = 1, \dots, d. \]

Proof  The condition $\rho(0) = 0$ and the boundedness of $\rho'$ imply that $|\rho(t)| \le M|t|$ with $M := \sup_{t \in \mathbb{K}} |\rho'(t)|$, and therefore $f \in L^p(D)$ implies that $\rho \circ f \in L^p(D)$. The boundedness of $\rho'$ implies that $\rho' \circ f$ is bounded, and therefore $\partial_j f \in L^p(D)$ implies that $(\rho' \circ f)\,\partial_j f \in L^p(D)$. To conclude the proof it therefore suffices to check that $(\rho' \circ f)\,\partial_j f$ is a weak derivative in the $j$th direction of $\rho \circ f$.

Fix $\phi \in C_c^\infty(D)$ and let $U \Subset D$ be an open set containing the compact support of $\phi$. By Proposition 11.18 we can find functions $f_n \in C_c^\infty(D)$ such that $f_n|_U \to f|_U$ in $W^{1,p}(U)$. Since $|\rho \circ f_n - \rho \circ f| \le M|f_n - f|$ and $f_n \to f$ in $L^p(U)$, Hölder's inequality and the classical chain rule imply that

\[ \int_D (\rho \circ f)\,\partial_j \phi\,dx = \lim_{n \to \infty} \int_D (\rho \circ f_n)\,\partial_j \phi\,dx = -\lim_{n \to \infty} \int_D (\rho' \circ f_n)(\partial_j f_n)\,\phi\,dx. \]

Upon passing to a subsequence if necessary, by Corollary 2.21 we may assume that $f_n \to f$ and $\partial_j f_n \to \partial_j f$ almost everywhere on $U$ and that there exists a function $0 \le g \in L^p(U)$ such that $|\partial_j f_n| \le g$ almost everywhere on $U$ for every $n$. Then $|(\rho' \circ f_n)(\partial_j f_n)\,\phi| \le M|g||\phi|$ almost everywhere on $U$ for every $n$, and since $|g||\phi| \in L^1(U)$ we may use dominated convergence to obtain

\[ \lim_{n \to \infty} \int_D (\rho' \circ f_n)(\partial_j f_n)\,\phi\,dx = \int_D (\rho' \circ f)(\partial_j f)\,\phi\,dx, \]

noting that on both sides the integrands vanish outside $U$. This completes the proof.


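Proposition 11.20 (Substitution rule). Let $D$ and $D'$ be nonempty open subsets of $\mathbb{R}^d$ and suppose that $\rho : D \to D'$ is a $C^k$-diffeomorphism with $k \ge 1$. Let $1 \le p \le \infty$. A function $f \in L^1_{\mathrm{loc}}(D)$ is weakly differentiable if and only if $f \circ \rho^{-1} \in L^1_{\mathrm{loc}}(D')$ is weakly differentiable. Denoting the space variables of $D$ and $D'$ by $x$ and $y$, respectively, in that case we have

\[ \frac{\partial f}{\partial x_i}(x) = \sum_{j=1}^d \frac{\partial \rho_j}{\partial x_i}(x)\,\frac{\partial (f \circ \rho^{-1})}{\partial y_j}(\rho(x)), \qquad i = 1, \dots, d. \]

Proof  For smooth functions $f$ this is the substitution rule from Calculus, and the general case follows by local approximation with smooth functions.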


11.1.c The Sobolev Spaces $W_0^{1,p}(D)$

We now introduce a class of spaces which play an important role in the theory of partial differential equations, where they provide the $L^p$-setting for studying boundary value problems subject to Dirichlet boundary conditions.

Definition 11.21 (The spaces $W_0^{1,p}(D)$). For $1 \le p < \infty$ we define $W_0^{1,p}(D)$ to be the closure of $C_c^\infty(D)$ in $W^{1,p}(D)$.

The following result gives a simple sufficient condition for membership of $W_0^{1,p}(D)$:

Proposition 11.22. Let $U \Subset D$ be open and let $1 \le p < \infty$. If $f \in W^{1,p}(D)$ vanishes outside $U$, then $f \in W_0^{1,p}(D)$.


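Proof  Since $\overline{U}$ is compact and does not intersect $\partial D$ we have $\delta := d(U, \partial D) > 0$. Fix a function $\eta \in C_c^\infty(\mathbb{R}^d)$ compactly supported in the ball $B(0;1)$ and satisfying $\int_{\mathbb{R}^d} \eta\,dx = 1$, and set $\eta_\varepsilon(x) = \varepsilon^{-d}\eta(\varepsilon^{-1}x)$ for $0 < \varepsilon < \delta$. Each $\eta_\varepsilon$ has compact support in $B(0;\varepsilon)$. Extending $f$ identically zero outside $D$, the condition $0 < \varepsilon < \delta$ implies that the convolution $\eta_\varepsilon * f$ is compactly supported in $D$.

As $\varepsilon \downarrow 0$, by Proposition 2.34 we have

\[ \eta_\varepsilon * f \to f \quad \text{in } L^p(\mathbb{R}^d), \text{ and hence in } L^p(D). \tag{11.4} \]

By Proposition 11.10 the weak derivatives of $\eta_\varepsilon * f$ are given by

\[ \partial_j(\eta_\varepsilon * f) = \eta_\varepsilon * \partial_j f, \qquad j = 1, \dots, d. \]

These weak derivatives belong to $L^p(D)$, so $\eta_\varepsilon * f$ restricts to an element of $W^{1,p}(D)$. By Proposition 2.34,

\[ \partial_j(\eta_\varepsilon * f) = \eta_\varepsilon * \partial_j f \to \partial_j f \quad \text{in } L^p(\mathbb{R}^d), \text{ and hence in } L^p(D). \tag{11.5} \]

By (11.4) and (11.5) we have $\eta_\varepsilon * f \to f$ in $W^{1,p}(D)$. Finally we observe that $\eta_\varepsilon * f$ is $C^\infty$ (by Proposition 11.1) and compactly supported (since both $\eta_\varepsilon$ and $f$ have compact support).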


It is trivial that if $f \in W^{1,p}(D)$, then $\operatorname{Re} f, \operatorname{Im} f \in W^{1,p}(D)$, and similarly if $f \in W_0^{1,p}(D)$, then $\operatorname{Re} f, \operatorname{Im} f \in W_0^{1,p}(D)$. The next theorem asserts that the positive part $f^+$ of a real-valued function $f$ in $W^{1,p}(D)$ belongs to $W^{1,p}(D)$ and provides an explicit expression for its weak derivatives; moreover, if $f \in W_0^{1,p}(D)$, then $f^+ \in W_0^{1,p}(D)$. The analogous results then also hold for the negative part $f^- = f^+ - f$ and the absolute value $|f| = f^+ + f^-$.

Theorem 11.23 (Positive parts). Let $1 \le p < \infty$. Then:

(1) for all real-valued functions $f \in W^{1,p}(D)$ we have $f^+ \in W^{1,p}(D)$ and

\[ \partial_j f^+ = \mathbf{1}_{\{f > 0\}}\,\partial_j f, \qquad j = 1, \dots, d, \]

and if $f \in W_0^{1,p}(D)$, then $f^+ \in W_0^{1,p}(D)$;

(2) the mapping $f \mapsto f^+$ is continuous with respect to the norm of $W^{1,p}(D)$.


Proof  The idea of the proof is to approximate the function $\rho : t \mapsto t^+ = \max\{t, 0\}$ with $C^1$-functions $\rho_n$ in such a way that for all $t \in \mathbb{R}$ we have

(i) $0 \le \rho_n(t) \uparrow t^+$;
(ii) $0 \le \rho_n'(t) \uparrow \mathbf{1}_{(0,\infty)}(t)$.

For instance, the choice $\rho_n(t) = (t^2 + \frac{1}{n^2})^{1/2} - \frac{1}{n}$ for $t > 0$ and $\rho_n(t) = 0$ for $t \le 0$ will do; see Figure 11.1.
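Properties (i) and (ii) of this particular choice can be spot-checked on a grid. The sketch below is illustrative only; the function names `rho` and `rho_prime`, the sample grid on $[-1, 2]$, and the values of $n$ are assumptions made here, and the derivative is the closed form $t/(t^2 + n^{-2})^{1/2}$ obtained by direct differentiation for $t > 0$.

```python
import math

def rho(n, t):
    """rho_n(t) = sqrt(t^2 + 1/n^2) - 1/n for t > 0, and 0 for t <= 0."""
    return math.sqrt(t * t + 1.0 / (n * n)) - 1.0 / n if t > 0 else 0.0

def rho_prime(n, t):
    """Classical derivative of rho_n; it vanishes for t <= 0."""
    return t / math.sqrt(t * t + 1.0 / (n * n)) if t > 0 else 0.0

ts = [-1.0 + 0.01 * i for i in range(301)]  # sample points in [-1, 2]
for n in (1, 10, 100):
    gap = max(max(t, 0.0) - rho(n, t) for t in ts)   # distance to t^+
    slope = max(rho_prime(n, t) for t in ts)         # largest slope seen
    print(n, gap, slope)
```

On the grid the gap to $t^+$ should stay below $1/n$ and the slopes should stay below $1$, consistent with the monotone convergence in (i) and (ii).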


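Step 1 – Let $f$ be a real-valued function in $W^{1,p}(D)$. By Proposition 11.19 we have $\rho_n \circ f \in W^{1,p}(D)$ and, for test functions $\phi \in C_c^\infty(D)$,

\[ -\int_D (\rho_n \circ f)(x)\,\partial_j \phi(x)\,dx = \int_D (\rho_n' \circ f)(x)\,\partial_j f(x)\,\phi(x)\,dx. \]

By (i), (ii), and dominated convergence we obtain

\[ -\int_D f^+(x)\,\partial_j \phi(x)\,dx = \int_D \mathbf{1}_{\{f > 0\}}(x)\,\partial_j f(x)\,\phi(x)\,dx. \]

This proves the first part of (1).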


Step 2 – To prove (2), let $f_n \to f$ in $W^{1,p}(D)$ with all functions real-valued. We must show that $f_n^+ \to f^+$ in $W^{1,p}(D)$. It is clear that $f_n^+ \to f^+$ in $L^p(D)$, so it remains to show that $\partial_j f_n^+ \to \partial_j f^+$ for all $1 \le j \le d$. By Step 1 this is equivalent to showing that $\mathbf{1}_{\{f_n > 0\}}\,\partial_j f_n \to \mathbf{1}_{\{f > 0\}}\,\partial_j f$ in $L^p(D)$ for all $1 \le j \le d$.

Figure 11.1 The functions $\rho_n(t) = (t^2 + \frac{1}{n^2})^{1/2} - \frac{1}{n}$

By Corollary 2.21 we can choose a subsequence such that $f_{n_k} \to f$ and $|f_{n_k}| \le g$ almost everywhere, where $0 \le g \in L^p(D)$, and $\partial_j f_{n_k} \to \partial_j f$ and $|\partial_j f_{n_k}| \le h_j$ almost everywhere, where $0 \le h_j \in L^p(D)$, for all $1 \le j \le d$. Then $\mathbf{1}_{\{f_{n_k} > 0\}} \to \mathbf{1}_{\{f > 0\}}$ almost everywhere, so dominated convergence implies that $\mathbf{1}_{\{f_{n_k} > 0\}}\,\partial_j f_{n_k} \to \mathbf{1}_{\{f > 0\}}\,\partial_j f$ in $L^p(D)$ for all $1 \le j \le d$.

Applying the above argument to subsequences of $(f_n)_{n \ge 1}$, we thus find that every subsequence $(f_{n_m})_{m \ge 1}$ of $(f_n)_{n \ge 1}$ contains a further subsequence $(f_{n_{m_k}})_{k \ge 1}$ such that $f_{n_{m_k}}^+ \to f^+$ in $W^{1,p}(D)$. This implies that $f_n^+ \to f^+$ in $W^{1,p}(D)$.


Step 3 – It remains to prove the second part of (1). Suppose that $f \in W_0^{1,p}(D)$ is real-valued; we must prove that $f^+ \in W_0^{1,p}(D)$. Choose functions $f_n \in C_c^\infty(D)$ such that $f_n \to f$ in $W^{1,p}(D)$. Then $f_n^+ \to f^+$ in $W^{1,p}(D)$ by Step 2. Thus it suffices to show that every $f_n^+$ can be approximated by functions in $C_c^\infty(D)$. This is accomplished by a mollifier argument.

Fix $n \ge 1$. Since $f_n$ is compactly supported in $D$, its support has positive distance $\delta_n$ to the boundary of $D$. If $\eta \in C_c^\infty(\mathbb{R}^d)$ has compact support in $B(0;1)$ and satisfies $\int_{\mathbb{R}^d} \eta\,dx = 1$, then for $0 < \varepsilon < \delta_n$ the function $\eta_\varepsilon * f_n^+$ (where $\eta_\varepsilon(x) = \varepsilon^{-d}\eta(\varepsilon^{-1}x)$) is smooth by Proposition 11.1, has compact support in $D$, and by Proposition 2.34 we have $\eta_\varepsilon * f_n^+ \to f_n^+$ in $L^p(D)$ as $\varepsilon \downarrow 0$. Likewise, by (11.3) applied to $f_n^+$,

\[ \partial_j(\eta_\varepsilon * f_n^+) = \eta_\varepsilon * (\partial_j f_n^+) \to \partial_j f_n^+ \]

with convergence in $L^p(D)$.
The following theorem connects the space $W_0^{1,p}(D)$ with Dirichlet boundary conditions.

Theorem 11.24. Let $D$ be bounded and let $1 \le p < \infty$. For all $f \in W^{1,p}(D) \cap C(\overline{D})$ the following assertions hold:

(1) if $f|_{\partial D} = 0$, then $f \in W_0^{1,p}(D)$;
(2) if $\partial D$ is $C^1$ and $f \in W_0^{1,p}(D)$, then $f|_{\partial D} = 0$.

Proof  (1): By considering real and imaginary parts separately we may assume that $f$ is real-valued, and by Theorem 11.23 we may even assume that $f$ is nonnegative. Since $f - \frac{1}{k} \to f$ in $W^{1,p}(D)$ as $k \to \infty$ and taking positive parts is continuous in $W^{1,p}(D)$ by Theorem 11.23, we see that $f_k := (f - \frac{1}{k})^+ \to f^+ = f$ in $W^{1,p}(D)$ as $k \to \infty$. Hence to prove that $f \in W_0^{1,p}(D)$ it suffices to prove that $f_k \in W_0^{1,p}(D)$ for all $k \ge 1$.

By continuity, each $f_k$ vanishes in a neighbourhood of the compact set $\partial D$. Then Proposition 11.22 shows that $f_k$ can be approximated in $W^{1,p}(D)$ by functions in $C_c^\infty(D)$.

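(2): By the definition of a $C^1$-boundary and a partition of unity argument, it suffices to prove that for all $f \in W_0^{1,p}(\mathbb{R}^d_+) \cap C(\overline{\mathbb{R}^d_+})$ we have $f|_{\partial \mathbb{R}^d_+} = 0$; here,

\[ \mathbb{R}^d_+ := \{x \in \mathbb{R}^d : x_d > 0\}. \]

Let $\phi_n \to f$ in $W_0^{1,p}(\mathbb{R}^d_+)$ with $\phi_n \in C_c^\infty(\mathbb{R}^d_+)$ for all $n \ge 1$. For all $x = (x', x_d) \in \mathbb{R}^d_+ = \mathbb{R}^{d-1} \times (0,\infty)$,

\[ |\phi_n(x', x_d)| \le \int_0^{x_d} |\partial_d \phi_n(x', y)|\,dy. \]

Integrating over $|x'| < 1$ and averaging over $0 < x_d < \varepsilon$ with $0 < \varepsilon < 1$, we obtain

\[ \begin{aligned} \frac{1}{\varepsilon} \int_0^\varepsilon \int_{\{|x'| < 1\}} |\phi_n(x', x_d)|\,dx'\,dx_d &\le \frac{1}{\varepsilon} \int_0^\varepsilon \int_{\{|x'| < 1\}} \int_0^{x_d} |\partial_d \phi_n(x', y)|\,dy\,dx'\,dx_d \\ &= \frac{1}{\varepsilon} \int_0^\varepsilon \int_{\{|x'| < 1\}} \int_y^\varepsilon |\partial_d \phi_n(x', y)|\,dx_d\,dx'\,dy \\ &\le \int_0^\varepsilon \int_{\{|x'| < 1\}} |\partial_d \phi_n(x', y)|\,dx'\,dy. \end{aligned} \]

Letting $n \to \infty$, we obtain

\[ \frac{1}{\varepsilon} \int_0^\varepsilon \int_{\{|x'| < 1\}} |f(x', x_d)|\,dx'\,dx_d \le \int_0^\varepsilon \int_{\{|x'| < 1\}} |\partial_d f(x', y)|\,dx'\,dy. \]

Letting $\varepsilon \downarrow 0$, we obtain

\[ \int_{\{|x'| < 1\}} |f(x', 0)|\,dx' = 0. \]

The same argument applies with $\{|x'| < 1\}$ replaced by any ball in $\mathbb{R}^{d-1}$. This implies $f(x', 0) = 0$ for almost all $x' \in \mathbb{R}^{d-1}$ and, since $f$ is continuous on $\overline{\mathbb{R}^d_+}$, $f(x', 0) = 0$ for all $x' \in \mathbb{R}^{d-1}$.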


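Theorem 11.25. For all $1 \le p < \infty$ we have

\[ W_0^{1,p}(\mathbb{R}^d) = W^{1,p}(\mathbb{R}^d). \]

Proof  Clearly, $W_0^{1,p}(\mathbb{R}^d) \subseteq W^{1,p}(\mathbb{R}^d)$. To prove the reverse inclusion we must show that every $f \in W^{1,p}(\mathbb{R}^d)$ can be approximated in $W^{1,p}(\mathbb{R}^d)$ by functions in $C_c^\infty(\mathbb{R}^d)$.

Let $\psi_n \in C_c^\infty(\mathbb{R}^d)$ satisfy $0 \le \psi_n \le 1$ pointwise on $\mathbb{R}^d$, $\psi_n(\xi) = 1$ for all $|\xi| \le n$, and $|\nabla \psi_n| \le 1$ on $\mathbb{R}^d$. Proposition 11.14 implies that the functions $\psi_n f$ belong to $W^{1,p}(\mathbb{R}^d)$, and by dominated convergence we have $\psi_n f \to f$ and $\partial_j(\psi_n f) = \psi_n\,\partial_j f + f\,\partial_j \psi_n \to \partial_j f$ in $L^p(\mathbb{R}^d)$ as $n \to \infty$, using that $\partial_j \psi_n \to 0$ and $\psi_n \to 1$ pointwise on $\mathbb{R}^d$ with uniform bounds $|\partial_j \psi_n| \le 1$ and $|\psi_n| \le 1$ to justify dominated convergence. It follows that $\psi_n f \to f$ in $W^{1,p}(\mathbb{R}^d)$.

Accordingly it suffices to prove that every compactly supported function in $W^{1,p}(\mathbb{R}^d)$ can be approximated, in the norm of $W^{1,p}(\mathbb{R}^d)$, by test functions. This was accomplished in Proposition 11.22.

With the same method of proof one obtains the following density result.

Theorem 11.26. For all $k \in \mathbb{N}$ and $1 \le p < \infty$ the space $C_c^\infty(\mathbb{R}^d)$ is dense in $W^{k,p}(\mathbb{R}^d)$.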


11.1.d Extension Operators

Let $k \in \mathbb{N}$ be an integer that is kept fixed throughout this section. We say that $D$ has a $C^k$-boundary if for every $x_0 \in \partial D$ there exist open sets $U, V \subseteq \mathbb{R}^d$ with $x_0 \in U$, and a $C^k$-diffeomorphism $\rho : U \to V$ with the following properties:

(i) $\rho(D \cap U) = \{x \in V : x_d > 0\}$;
(ii) $\rho(\partial D \cap U) = \{x \in V : x_d = 0\}$;
(iii) there exists a constant $C > 0$ such that

\[ C^{-1} \le |\det(D\rho(x))| \le C, \qquad x \in U, \]

where $D\rho$ is the total derivative of $\rho$.

See Figure 11.2.

In this situation, by Proposition 11.20 applied inductively, a function $u \in L^1_{\mathrm{loc}}(U)$ belongs to $W^{k,p}(U)$ if and only if the function $v := u \circ \rho^{-1} \in L^1_{\mathrm{loc}}(V)$ belongs to $W^{k,p}(V)$ and

\[ C^{-1} \|v\|_{W^{k,p}(V)} \le \|u\|_{W^{k,p}(U)} \le C \|v\|_{W^{k,p}(V)}. \tag{11.6} \]

Figure 11.2 The definition of a $C^k$-boundary


Theorem 11.27 (Density of $C^\infty(\overline{D})$ in $W^{k,p}(D)$). If $D$ is bounded with $C^k$-boundary, then for all $1 \le p < \infty$ the space $C^\infty(\overline{D})$ is dense in $W^{k,p}(D)$.

We actually prove the stronger result that for any $f \in W^{k,p}(D)$ there exists a sequence of functions $f_n \in C_c^\infty(\mathbb{R}^d)$ whose restrictions to $D$ satisfy $\lim_{n \to \infty} \|f_n - f\|_{W^{k,p}(D)} = 0$.


Proof The proof proceeds in three steps. The first step deals with the case where $D$ is (a bounded open subset of) $\mathbb{R}^d_+ := \{x \in \mathbb{R}^d : x_d > 0\}$. The second step, commonly referred to as 'straightening the boundary', uses the definition of a $C^k$-domain to carry over the result of Step 1 to the open sets $U$ in the definition. The third step patches together the local results of Step 2 by means of a partition of unity argument.

Step 1 – In this first step we prove that if $f \in W^{k,p}(\mathbb{R}^d_+)$, then there exists a sequence of functions $f_n \in C_{\rm c}^\infty(\mathbb{R}^d)$ whose restrictions to $\mathbb{R}^d_+$ satisfy $f_n \to f$ in $W^{k,p}(\mathbb{R}^d_+)$ as $n \to \infty$.

Let $\psi \in C_{\rm c}^\infty(\mathbb{R}^d)$ satisfy $\mathbf{1}_{B(0;1)} \le \psi \le \mathbf{1}_{B(0;2)}$ and let $M := \sup_{|\alpha|\le k} \|\partial^\alpha\psi\|_\infty$. Then for all $n \ge 1$ the function $\psi_n(x) := \psi(x/n)$ satisfies $\mathbf{1}_{B(0;n)} \le \psi_n \le \mathbf{1}_{B(0;2n)}$ and $\|\partial^\alpha\psi_n\|_\infty \le n^{-|\alpha|}\|\partial^\alpha\psi\|_\infty \le M$. By the product rule and dominated convergence, this implies that $\psi_n f \to f$ in $W^{k,p}(\mathbb{R}^d_+)$. It follows that we may assume that $f$ has bounded support.

For $t > 0$ define the functions

$$ f_t(x) := f(x + t e_d), \quad x \in \mathbb{R}^d_+, $$

where $e_d$ is the $d$th unit vector of $\mathbb{R}^d$. Clearly we have $f_t \in W^{k,p}(\mathbb{R}^d_+)$, with weak derivatives

$$ \partial^\alpha f_t(x) = (\partial^\alpha f)(x + t e_d), \quad x \in \mathbb{R}^d_+. \tag{11.7} $$

The $L^p$-continuity of translations (Proposition 2.32) therefore implies that $f_t \to f$ in $W^{k,p}(\mathbb{R}^d_+)$ as $t \downarrow 0$. As a result, it suffices to approximate each $f_t$ in $W^{k,p}(\mathbb{R}^d_+)$ with functions in $C_{\rm c}^\infty(\mathbb{R}^d)$.

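In the remainder of this step we fix $t > 0$. Let $h \in W^{k,p}(\mathbb{R}^d)$ be a function whose restriction to $\{x \in \mathbb{R}^d : x_d > t\}$ agrees with the restriction of $f$ on that set. Such a function may be found by multiplying $f$ with a test function $\psi \in C_{\rm c}^\infty(\mathbb{R}^d_+)$ satisfying $\psi \equiv 1$ on $\mathrm{supp}(f) \cap \{x \in \mathbb{R}^d : x_d > t\}$; this results in a function with the desired properties by Proposition 11.14 and Lemma 11.15. Letting $h_t(x) := h(x + t e_d)$ for $x \in \mathbb{R}^d$ it follows that for almost all $x \in \mathbb{R}^d_+$ we have

$$ (\partial^\alpha h_t)(x) = \partial^\alpha h(x + t e_d) = \partial^\alpha f(x + t e_d) = \partial^\alpha f_t(x). $$

Choose a mollifier $\eta \in C_{\rm c}^\infty(\mathbb{R}^d)$ satisfying $\int_{\mathbb{R}^d} \eta\,dx = 1$, and let $\eta_\varepsilon(x) := \varepsilon^{-d}\eta(\varepsilon^{-1}x)$ for $\varepsilon > 0$ and $x \in \mathbb{R}^d$. By Proposition 11.10, the functions $h_{t,\varepsilon} := \eta_\varepsilon * h_t$ belong to $C_{\rm c}^\infty(\mathbb{R}^d)$ and for all multi-indices $|\alpha| \le k$ we have

$$ \partial^\alpha h_{t,\varepsilon} = \eta_\varepsilon * (\partial^\alpha h_t). \tag{11.8} $$

By (11.7), (11.8), and Proposition 2.34, for $\varepsilon \downarrow 0$ we then obtain

$$ \|\partial^\alpha h_{t,\varepsilon} - \partial^\alpha f_t\|_{L^p(\mathbb{R}^d_+)} = \|\eta_\varepsilon * \partial^\alpha h_t - \partial^\alpha h_t\|_{L^p(\mathbb{R}^d_+)} \to 0. $$

This gives the desired approximation.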


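Step 2 – Let $x_0 \in \partial D$ be fixed. Choose open sets $U, V \subseteq \mathbb{R}^d$ with $x_0 \in U$, and a $C^k$-diffeomorphism $\rho : U \to V$ with the properties (i)–(iii) in the definition of a $C^k$-domain. In this step we assume that $f \in W^{k,p}(D)$ has its support in an open set $\widetilde U \Subset U$.

Let $g : V\cap\mathbb{R}^d_+ \to \mathbb{K}$ be defined by $g := f\circ\rho^{-1}$. Then $g \in W^{k,p}(V\cap\mathbb{R}^d_+)$ by (11.6). Since $\rho(\widetilde U) \Subset V$, the same argument as in the proof of Lemma 11.15 shows that the zero extension $\widetilde g$ of $g$ to all of $\mathbb{R}^d_+$ belongs to $W^{k,p}(\mathbb{R}^d_+)$.

By Step 1 we may choose functions $g_n \in C_{\rm c}^\infty(\mathbb{R}^d)$ such that $g_n \to \widetilde g$ in $W^{k,p}(\mathbb{R}^d_+)$. Fix a test function $\zeta \in C_{\rm c}^\infty(V)$ such that $\zeta \equiv 1$ on $\rho(\widetilde U)$. Then also $\zeta g_n \to \zeta\widetilde g$ in $W^{k,p}(\mathbb{R}^d_+)$, and on $V\cap\mathbb{R}^d_+$ we have $\zeta\widetilde g = g$. Replacing $g_n$ by $\zeta g_n$ if necessary, we may therefore assume without loss of generality that $g_n \in C_{\rm c}^\infty(\widetilde V)$, where $\widetilde V \Subset V$ contains the support of $\zeta$. Let $f_n := g_n\circ\rho$. Then $f_n \in C_{\rm c}^k(U)$. It follows that $f_n \in C_{\rm c}^k(\mathbb{R}^d)$ by zero extension, and $f_n \to f$ in $W^{k,p}(D)$ by (11.6).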

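Step 3 – Now let $f \in W^{k,p}(D)$ be arbitrary. Since $\partial D$ is compact, as in Step 2 we can find open sets $U_m$ and $V_m$ and $C^k$-diffeomorphisms $\rho_m : U_m \to V_m$, $m = 1,\dots,M$, as well as open sets $\widetilde U_m \Subset U_m$, $m = 1,\dots,M$, in such a way that $\partial D \subseteq \bigcup_{m=1}^M \widetilde U_m$. By adding one open set $\widetilde U_0 \Subset D$ we may arrange that $\overline{D} \subseteq \bigcup_{m=0}^M \widetilde U_m$. Let $(\eta_m)_{m=0}^M$ be a smooth partition of unity for $\overline{D}$ subordinate to this cover, that is, $\eta_m \in C_{\rm c}^\infty(\widetilde U_m)$ for $m = 0,1,\dots,M$ and

$$ \sum_{m=0}^M \eta_m \equiv 1 \ \text{on } \overline{D}, \qquad 0 \le \eta_m \le 1, \quad m = 0,1,\dots,M. $$

Let $f^{(m)} := \eta_m f$. By Lemma 11.15 the zero extension of $f^{(0)}$ belongs to $W^{k,p}(\mathbb{R}^d)$, and therefore by Theorem 11.25 we can find $f_n^{(0)} \in C_{\rm c}^\infty(\mathbb{R}^d)$ such that $f_n^{(0)} \to f^{(0)}$ in $W^{k,p}(D)$. For $1 \le m \le M$ the function $f^{(m)}$ is supported in $\widetilde U_m$ and by Step 2 we can find $f_n^{(m)} \in C_{\rm c}^k(\mathbb{R}^d)$ such that $f_n^{(m)} \to f^{(m)}$ in $W^{k,p}(D)$.

Finally let $f_n := \sum_{m=0}^M f_n^{(m)}$. Then $f_n \in C_{\rm c}^k(\mathbb{R}^d)$ and, as $n \to \infty$,

$$ \|f_n - f\|_{W^{k,p}(D)} \le \sum_{m=0}^M \|f_n^{(m)} - f^{(m)}\|_{W^{k,p}(D)} \to 0. $$

The restrictions of $f_n$ to $D$ have the required properties.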


The boundary of a $C^k$-domain is locally the graph of a $C^k$-function in $d-1$ 'horizontal' coordinates, and in fact this could be taken as an alternative definition of a $C^k$-domain. By translating $f$ in the remaining 'vertical' direction, the use of the $C^k$-diffeomorphism $\rho$ and the substitution rule can be avoided and all constructions can be carried out directly in $U$. This is the reason we have been rather brief in explaining the fine details of the substitution rule and its use in the present proof. Nevertheless we prefer the approach presented here, as it brings out clearly the idea that constructions involving the boundary can locally be reduced to hyperplanes $\{x \in \mathbb{R}^d : x_d = 0\}$ using $C^k$-diffeomorphisms. The advantage of this becomes even more clear in the proof of the next theorem.


Theorem 11.28 (Extension operator). Let $D$ be bounded and have a $C^k$-boundary. Then there exists a linear mapping $E : L^1_{\rm loc}(D) \to L^1_{\rm loc}(\mathbb{R}^d)$ with the following properties:


(i) for all $f \in L^1_{\rm loc}(D)$ we have $(Ef)|_D = f$;
(ii) for all $1 \le p < \infty$ and $f \in W^{k,p}(D)$ we have $Ef \in W^{k,p}(\mathbb{R}^d)$;
(iii) for all $1 \le p < \infty$ there exists a constant $C \ge 0$ such that for all $\ell = 0,1,\dots,k$,

$$ \|Ef\|_{W^{\ell,p}(\mathbb{R}^d)} \le C\|f\|_{W^{\ell,p}(D)}, \quad f \in W^{\ell,p}(D). $$


Proof We proceed in three steps. 


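Step 1 – Let the integer $0 \le \ell \le k$ be fixed. For $f \in L^1_{\rm loc}(\mathbb{R}^d_+)$ and multi-indices $|\alpha| \le \ell$ define $E_\alpha f \in L^1_{\rm loc}(\mathbb{R}^d)$ by

$$ E_\alpha f(x) := \begin{cases} f(x), & x \in \mathbb{R}^d_+, \\ \sum_{j=1}^{\ell+1} (-j)^{\alpha_d} c_j f(x_1,\dots,x_{d-1},-jx_d), & x \notin \mathbb{R}^d_+, \end{cases} $$

where the scalars $c_j$ are chosen in such a way that

$$ \sum_{j=1}^{\ell+1} (-j)^m c_j = 1, \quad m = 0,1,\dots,\ell. $$

By the theory of Vandermonde determinants this system of $\ell+1$ equations is uniquely solvable for $c_1,\dots,c_{\ell+1}$.

It is easily verified that if $f \in C^\ell(\overline{\mathbb{R}^d_+})$, then $E_0 f \in C^\ell(\mathbb{R}^d)$ (due to the choice of the coefficients $c_j$) and $\partial^\alpha E_0 f = E_\alpha\partial^\alpha f$. Thus

$$ \|\partial^\alpha E_0 f\|_{L^p(\mathbb{R}^d)}^p = \int_{\mathbb{R}^d_+} |\partial^\alpha f(x)|^p\,dx + \int_{\mathbb{R}^d\setminus\mathbb{R}^d_+} \Big|\sum_{j=1}^{\ell+1} (-j)^{\alpha_d} c_j\,\partial^\alpha f(x_1,\dots,x_{d-1},-jx_d)\Big|^p dx \le C_{\ell,p}^p \|\partial^\alpha f\|_{L^p(\mathbb{R}^d_+)}^p. $$

This gives the bound

$$ \|E_0 f\|_{W^{\ell,p}(\mathbb{R}^d)} \le C_{\ell,p}\|f\|_{W^{\ell,p}(\mathbb{R}^d_+)}. $$

By Theorem 11.27 this bound extends to functions $f \in W^{\ell,p}(\mathbb{R}^d_+)$.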
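The reflection coefficients $c_j$ can be computed explicitly. The following is a minimal sketch, assuming the matching conditions take the standard form $\sum_{j=1}^{\ell+1}(-j)^m c_j = 1$ for $m = 0,\dots,\ell$; the function name `reflection_coefficients` is hypothetical, and the system is solved in exact rational arithmetic by Gauss-Jordan elimination.

```python
from fractions import Fraction


def reflection_coefficients(ell):
    """Solve sum_{j=1}^{ell+1} (-j)^m c_j = 1 for m = 0, ..., ell exactly."""
    n = ell + 1
    # Augmented matrix of the Vandermonde-type system: row m, column j-1 holds (-j)^m.
    rows = [[Fraction((-j) ** m) for j in range(1, n + 1)] + [Fraction(1)]
            for m in range(n)]
    # Gauss-Jordan elimination with a search for a nonzero pivot.
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [entry / rows[col][col] for entry in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


for ell in range(3):
    print(ell, [str(c) for c in reflection_coefficients(ell)])
```

For $\ell = 0$ this gives the even reflection $c_1 = 1$, and for $\ell = 1$ the coefficients $c_1 = 3$, $c_2 = -2$, so that the extension reads $3f(x', -x_d) - 2f(x', -2x_d)$ below the hyperplane.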


Step 2 – Now let $D$ be bounded and open. By Theorem 11.27 it suffices to prove the existence of a linear mapping $E : L^1_{\rm loc}(D) \to L^1_{\rm loc}(\mathbb{R}^d)$ such that for all $f \in L^1_{\rm loc}(D)$ we have $(Ef)|_D = f$ and for all $1 \le p < \infty$, $\ell = 0,1,\dots,k$, and $f \in C^k(\overline{D})$ we have

$$ \|Ef\|_{W^{\ell,p}(\mathbb{R}^d)} \le C\|f\|_{W^{\ell,p}(D)} $$

with a constant $C \ge 0$ depending only on $\ell$, $p$, and $D$.

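Let $f \in L^1_{\rm loc}(D)$ and $0 \le \ell \le k$ be given. Using the notation of the proof of Theorem 11.27, set $f_m := \eta_m f$, $m = 0,1,\dots,M$. Let $h_0$ be the zero extension of $f_0$ to $\mathbb{R}^d$ and, for $m = 1,\dots,M$, let $g_m$ denote the zero extension to $\mathbb{R}^d_+$ of the function $f_m\circ\rho_m^{-1} \in L^1_{\rm loc}(V_m\cap\mathbb{R}^d_+)$. Let $E_0$ be the extension operator of Step 1 and define $Ef := \sum_{m=0}^M h_m$, where for $m = 1,\dots,M$ we set

$$ h_m := \begin{cases} (E_0 g_m)\circ\rho_m & \text{on } U_m, \\ 0 & \text{on } \complement U_m. \end{cases} $$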


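Then $Ef = f$ on $D$. If we now assume that $f \in C^k(\overline{D})$ and fix $1 \le p < \infty$, then $f_0 = \eta_0 f \in C_{\rm c}^k(D)$ and $\|h_0\|_{W^{\ell,p}(\mathbb{R}^d)} \le C_{\ell,p,D}\|f\|_{W^{\ell,p}(D)}$. For $m = 1,\dots,M$ we have $h_m \in C^\ell(U_m)$ with compact support in $U_m$. Therefore $h_m \in C_{\rm c}^\ell(\mathbb{R}^d)$ and, using (11.6) twice,

$$ \begin{aligned} \|h_m\|_{W^{\ell,p}(\mathbb{R}^d)} = \|h_m\|_{W^{\ell,p}(U_m)} &\le C_1\|E_0 g_m\|_{W^{\ell,p}(V_m)} \le C_1\|E_0 g_m\|_{W^{\ell,p}(\mathbb{R}^d)} \\ &\le C_2\|g_m\|_{W^{\ell,p}(\mathbb{R}^d_+)} = C_2\|g_m\|_{W^{\ell,p}(V_m\cap\mathbb{R}^d_+)} \\ &\le C_3\|f_m\|_{W^{\ell,p}(U_m\cap D)} \le C_4\|f\|_{W^{\ell,p}(D)}. \end{aligned} $$

It follows that

$$ \|Ef\|_{W^{\ell,p}(\mathbb{R}^d)} \le \sum_{m=0}^M \|h_m\|_{W^{\ell,p}(\mathbb{R}^d)} \le C_5\|f\|_{W^{\ell,p}(D)} $$

with all constants only depending on $\ell$, $p$, and $D$.


11.1.e Bessel Potential Spaces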


A function $f \in L^p(D)$, with $1 \le p \le \infty$, is said to admit a weak $L^p$-Laplacian if there exists a function $g \in L^p(D)$ such that for all test functions $\phi \in C_{\rm c}^\infty(D)$ we have

$$ \int_D f\Delta\phi\,dx = \int_D g\phi\,dx. $$

This defines a linear operator $\Delta_{p,D}$ in $L^p(D)$, the weak $L^p$-Laplacian on $D$, with domain

$$ \mathsf{D}(\Delta_{p,D}) := \{f \in L^p(D) : f \text{ admits a weak } L^p\text{-Laplacian on } D\}. $$

By the argument of Example 10.16, this operator is closed. In what follows we will denote weak Laplacians simply by $\Delta$.

Among other things, as an application of the Plancherel theorem the next theorem establishes that the domain of the weak Laplacian in $L^2(\mathbb{R}^d)$ equals $W^{2,2}(\mathbb{R}^d)$.


Theorem 11.29 (Fourier analytic characterisation, weak Laplacian). Let $k \ge 0$ be a nonnegative integer. For a function $f \in L^2(\mathbb{R}^d)$ the following assertions are equivalent:


(1) $f$ belongs to $W^{k,2}(\mathbb{R}^d)$;
(2) $\xi \mapsto (1+|\xi|^2)^{k/2}\widehat f(\xi)$ belongs to $L^2(\mathbb{R}^d)$.


Moreover,
$$ f \mapsto \big\|\xi \mapsto (1+|\xi|^2)^{k/2}\widehat f(\xi)\big\|_2 $$
defines an equivalent norm on $W^{k,2}(\mathbb{R}^d)$. For $k = 2$, (1) and (2) are equivalent to:
(3) $f$ admits a weak Laplacian in $L^2(\mathbb{R}^d)$.
Moreover, $f \mapsto \|f\|_2 + \|\Delta f\|_2$ defines an equivalent norm on $W^{2,2}(\mathbb{R}^d)$.


Definition 11.30 (The Bessel potential spaces $H^s(\mathbb{R}^d)$). For real numbers $s \ge 0$ the subspace of all $f \in L^2(\mathbb{R}^d)$ such that

$$ \xi \mapsto (1+|\xi|^2)^{s/2}\widehat f(\xi) $$

belongs to $L^2(\mathbb{R}^d)$ is denoted by $H^s(\mathbb{R}^d)$ and is called the Bessel potential space with smoothness exponent $s$. With respect to the norm

$$ \|f\|_{H^s(\mathbb{R}^d)} := \big\|\xi \mapsto (1+|\xi|^2)^{s/2}\widehat f(\xi)\big\|_{L^2(\mathbb{R}^d)} \tag{11.9} $$

this space is a Hilbert space. The easy verification is left as an exercise.
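As a sanity check of the norm (11.9), the following sketch evaluates it for a Gaussian in dimension $d = 1$ by numerical quadrature. It assumes the normalisation $\widehat f(\xi) = (2\pi)^{-1/2}\int e^{-ix\xi} f(x)\,dx$, under which $f(x) = e^{-x^2/2}$ is its own Fourier transform; the helper names `integrate` and `bessel_norm_sq` are hypothetical. For $s = 1$ the squared norm should agree with $\|f\|_2^2 + \|f'\|_2^2 = \tfrac32\sqrt{\pi}$.

```python
import math


def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# f(x) = exp(-x^2/2); with the assumed normalisation, fhat(xi) = exp(-xi^2/2).
f = lambda x: math.exp(-x * x / 2)
df = lambda x: -x * math.exp(-x * x / 2)
fhat = lambda xi: math.exp(-xi * xi / 2)


def bessel_norm_sq(s, cutoff=12.0):
    """Squared H^s norm of f: integral of (1 + xi^2)^s |fhat(xi)|^2 over [-cutoff, cutoff]."""
    return integrate(lambda xi: (1 + xi * xi) ** s * fhat(xi) ** 2, -cutoff, cutoff)


# Squared Sobolev-type norm ||f||_2^2 + ||f'||_2^2 computed on the physical side.
sobolev_sq = integrate(lambda x: f(x) ** 2 + df(x) ** 2, -12.0, 12.0)

print(bessel_norm_sq(1.0), sobolev_sq, 1.5 * math.sqrt(math.pi))
print(bessel_norm_sq(0.5))  # a noninteger smoothness exponent
```

The equality for $s = 1$ is special to the Hilbertian choice $\|f\|_2^2 + \|f'\|_2^2$; for other equivalent Sobolev norms one only gets a two-sided estimate.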


For noninteger values of $s$, the spaces $H^s(\mathbb{R}^d)$ play an important role in the regularity theory for partial differential equations.
The equivalence of (1) and (2) of Theorem 11.29 asserts:

Theorem 11.31 ($W^{k,2}(\mathbb{R}^d) = H^k(\mathbb{R}^d)$). If $k \ge 0$ is a nonnegative integer, then
$$ W^{k,2}(\mathbb{R}^d) = H^k(\mathbb{R}^d) $$
with equivalence of norms.


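The proof of Theorem 11.29 relies on the following lemma, where for vectors $\xi \in \mathbb{R}^d$ and multi-indices $\alpha \in \mathbb{N}^d$ we use the short-hand notation

$$ \xi^\alpha := \xi_1^{\alpha_1}\cdots\xi_d^{\alpha_d}. $$


Lemma 11.32. A function $f \in L^2(\mathbb{R}^d)$ admits a weak derivative $\partial^\alpha f$ belonging to $L^2(\mathbb{R}^d)$ if and only if $\xi \mapsto \xi^\alpha\widehat f(\xi)$ belongs to $L^2(\mathbb{R}^d)$, and in that case we have

$$ \widehat{\partial^\alpha f}(\xi) = i^{|\alpha|}\xi^\alpha\widehat f(\xi) $$

for almost all $\xi \in \mathbb{R}^d$.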


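Proof 'If': For test functions $\phi \in C_{\rm c}^\infty(\mathbb{R}^d)$ and $\xi \in \mathbb{R}^d$ we have

$$ \widehat{\partial^\alpha\phi}(\xi) = \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} \exp(-ix\cdot\xi)(\partial^\alpha\phi)(x)\,dx = \frac{i^{|\alpha|}\xi^\alpha}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} \exp(-ix\cdot\xi)\phi(x)\,dx = i^{|\alpha|}\xi^\alpha\widehat\phi(\xi). \tag{11.10} $$

Also, by differentiating under the integral, we see that $\widehat\phi$ is smooth and

$$ \partial^\alpha\widehat\phi(\xi) = (-i)^{|\alpha|}\big(x \mapsto x^\alpha\phi(x)\big)^{\wedge}(\xi). \tag{11.11} $$

Suppose now that $\xi \mapsto \xi^\alpha\widehat f(\xi)$ belongs to $L^2(\mathbb{R}^d)$ and denote its inverse Fourier–Plancherel transform by $g_\alpha$. Using the Plancherel isometry and (11.10), for real-valued $\phi \in C_{\rm c}^\infty(\mathbb{R}^d)$ we obtain

$$ \begin{aligned} (-1)^{|\alpha|}\int_{\mathbb{R}^d} f(x)\partial^\alpha\phi(x)\,dx &= (-1)^{|\alpha|}(f|\partial^\alpha\phi) = (-1)^{|\alpha|}(\widehat f|\widehat{\partial^\alpha\phi}) \\ &= i^{|\alpha|}\int_{\mathbb{R}^d} \xi^\alpha\widehat f(\xi)\overline{\widehat\phi(\xi)}\,d\xi = i^{|\alpha|}\int_{\mathbb{R}^d} g_\alpha(x)\phi(x)\,dx. \end{aligned} $$

Considering real and imaginary parts separately, the identity extends to complex-valued $\phi \in C_{\rm c}^\infty(\mathbb{R}^d)$. This shows that $f$ has weak derivative $\partial^\alpha f = i^{|\alpha|}g_\alpha$ in $L^2(\mathbb{R}^d)$.

'Only if': Fix a test function $\phi \in C_{\rm c}^\infty(\mathbb{R}^d)$. Using Proposition 5.29 and (11.11) we obtain

$$ \begin{aligned} \int_{\mathbb{R}^d} \widehat{\partial^\alpha f}(\xi)\phi(\xi)\,d\xi &= \int_{\mathbb{R}^d} (\partial^\alpha f)(\xi)\widehat\phi(\xi)\,d\xi = (-1)^{|\alpha|}\int_{\mathbb{R}^d} f(\xi)\partial^\alpha\widehat\phi(\xi)\,d\xi \\ &= i^{|\alpha|}\int_{\mathbb{R}^d} f(\xi)\big(x \mapsto x^\alpha\phi(x)\big)^{\wedge}(\xi)\,d\xi = i^{|\alpha|}\int_{\mathbb{R}^d} \xi^\alpha\widehat f(\xi)\phi(\xi)\,d\xi. \end{aligned} $$

By Proposition 11.5 this implies $\widehat{\partial^\alpha f}(\xi) = i^{|\alpha|}\xi^\alpha\widehat f(\xi)$ for almost all $\xi \in \mathbb{R}^d$. Since by assumption $\partial^\alpha f \in L^2(\mathbb{R}^d)$, the Plancherel theorem implies that $\widehat{\partial^\alpha f} \in L^2(\mathbb{R}^d)$. This shows that $\xi \mapsto \xi^\alpha\widehat f(\xi)$ belongs to $L^2(\mathbb{R}^d)$.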
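The identity $\widehat{\partial^\alpha f}(\xi) = i^{|\alpha|}\xi^\alpha\widehat f(\xi)$ of Lemma 11.32 can be tested numerically in dimension $d = 1$ with $|\alpha| = 1$. The sketch below assumes the normalisation $\widehat f(\xi) = (2\pi)^{-1/2}\int e^{-ix\xi}f(x)\,dx$ and uses a plain midpoint rule on a truncated interval; the helper name `fourier` is hypothetical.

```python
import cmath
import math


def fourier(func, xi, cutoff=12.0, n=8000):
    """Midpoint-rule approximation of (2*pi)^(-1/2) * int exp(-i x xi) func(x) dx."""
    h = 2 * cutoff / n
    total = 0j
    for i in range(n):
        x = -cutoff + (i + 0.5) * h
        total += cmath.exp(-1j * x * xi) * func(x)
    return h * total / math.sqrt(2 * math.pi)


f = lambda x: math.exp(-x * x / 2)          # rapidly decaying test function
df = lambda x: -x * math.exp(-x * x / 2)    # its classical (hence weak) derivative

for xi in (0.0, 0.7, 2.0):
    lhs = fourier(df, xi)             # transform of the derivative
    rhs = 1j * xi * fourier(f, xi)    # multiplier i * xi applied to the transform
    print(xi, abs(lhs - rhs))
```

The printed differences should be at the level of rounding error, since both sides are computed with the same quadrature rule for a Gaussian.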


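Proof of Theorem 11.29 We begin with the proof of the equivalence (1)⇔(2) for general integers $k \ge 0$.
The heart of the matter is contained in the two-sided estimate

$$ (1+|\xi|^2)^{k/2} \eqsim_{d,k} \Big(\sum_{|\alpha|\le k} |\xi^\alpha|^2\Big)^{1/2}, \tag{11.12} $$

which follows from the binomial identity. Here, the notation $A \eqsim_{d,k} B$ is short-hand for the existence of constants $c_{d,k}, c'_{d,k} \ge 0$, both depending only on $d$ and $k$, such that $A \le c_{d,k}B$ and $B \le c'_{d,k}A$.


(1)⇒(2): First let $f \in W^{k,2}(\mathbb{R}^d)$. By (11.12), Lemma 11.32, and Plancherel's theorem,

$$ \begin{aligned} \int_{\mathbb{R}^d} (1+|\xi|^2)^k|\widehat f(\xi)|^2\,d\xi &\eqsim_{d,k} \sum_{|\alpha|\le k}\int_{\mathbb{R}^d} |\xi^\alpha\widehat f(\xi)|^2\,d\xi = \sum_{|\alpha|\le k}\int_{\mathbb{R}^d} |\widehat{\partial^\alpha f}(\xi)|^2\,d\xi \\ &= \sum_{|\alpha|\le k}\int_{\mathbb{R}^d} |\partial^\alpha f(x)|^2\,dx \eqsim_{d,k} \|f\|_{W^{k,2}(\mathbb{R}^d)}^2. \end{aligned} $$

This shows that $\xi \mapsto (1+|\xi|^2)^{k/2}\widehat f(\xi)$ belongs to $L^2(\mathbb{R}^d)$, and that its $L^2$-norm is equivalent to the norm of $f$ in $W^{k,2}(\mathbb{R}^d)$.

(2)⇒(1): Suppose that $f \in L^2(\mathbb{R}^d)$ is a function with the property that $\xi \mapsto (1+|\xi|^2)^{k/2}\widehat f(\xi)$ belongs to $L^2(\mathbb{R}^d)$. Then $\xi \mapsto \xi^\alpha\widehat f(\xi)$ belongs to $L^2(\mathbb{R}^d)$ for all multi-indices $\alpha \in \mathbb{N}^d$ satisfying $|\alpha| \le k$, and therefore $f$ has weak derivatives $\partial^\alpha f$ for all $|\alpha| \le k$ by Lemma 11.32. This shows that $f \in W^{k,2}(\mathbb{R}^d)$.


For $k = 2$, (1) obviously implies (3). The converse follows from Lemma 11.32. The equivalence of norms again follows from the Plancherel theorem and (11.12).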


11.2 The Poisson Problem $-\Delta u = f$


The results developed above are now applied to study the Poisson problem. 


11.2.a Dirichlet Boundary Conditions 


We recall from Theorem 11.31 that $H^k(\mathbb{R}^d) = W^{k,2}(\mathbb{R}^d)$ with equivalent norms. For nonempty open subsets $D$ of $\mathbb{R}^d$ it is customary to define

$$ H^k(D) := W^{k,2}(D), \qquad H_0^1(D) := W_0^{1,2}(D), $$

where $W_0^{1,2}(D)$ is the closure of $C_{\rm c}^\infty(D)$ in $W^{1,2}(D)$. We endow $H^k(D)$ with the norm

$$ \|f\|_{H^k(D)}^2 := \sum_{|\alpha|\le k} \|\partial^\alpha f\|_2^2, $$

where the summation extends over all multi-indices $\alpha \in \mathbb{N}^d$ of order at most $k$. This norm is associated with the inner product

$$ (f|g)_{H^k(D)} := \sum_{|\alpha|\le k} (\partial^\alpha f|\partial^\alpha g)_2 $$

and it turns $H^k(D)$ into a Hilbert space. As a closed subspace of $H^1(D)$, the space $H_0^1(D)$ is a Hilbert space as well.
Let us now take a look at the Poisson problem with Dirichlet boundary conditions:

$$ \begin{cases} -\Delta u = f & \text{on } D, \\ u|_{\partial D} = 0, \end{cases} \tag{11.13} $$

where $f \in L^2(D)$ is a given function and $\Delta = \sum_{j=1}^d \partial_j^2$ is the Laplace operator. Multiplying both sides of the equation (11.13) with a test function $\phi \in C_{\rm c}^\infty(D)$ and integrating, we obtain the following integrated version of the problem:

$$ \int_D (\Delta u)\phi\,dx = -\int_D f\phi\,dx, $$

which, after a formal integration by parts (which can be rigorously justified if $u \in C^2(D)$), can be rewritten as

$$ \int_D \nabla u\cdot\nabla\phi\,dx = \int_D f\phi\,dx. \tag{11.14} $$

This formal derivation justifies the following definition.


Definition 11.33 (Weak solutions). A function $u \in H_0^1(D)$ is called a weak solution of the Poisson problem with Dirichlet boundary conditions (11.13) if

$$ \int_D \nabla u\cdot\nabla\phi\,dx = \int_D f\phi\,dx, \quad \phi \in C_{\rm c}^\infty(D). $$


The notion of weak solution makes sense for inhomogeneities $f \in L^2(D)$. In the special case $f \in C(\overline{D})$, a classical solution may be defined as a function $u \in C^2(D) \cap C(\overline{D})$ satisfying the equations of the Poisson problem, $-\Delta u = f$ on $D$ and $u|_{\partial D} = 0$, pointwise. Classical solutions may not exist, however, even when $f \in C_{\rm c}(D)$ (see Problem 11.26). The advantage of working with weak solutions is that the integrated equation (11.14) involves only first derivatives and both integrals in (11.14) can be interpreted as inner products. This makes the problem of finding weak solutions amenable to Hilbert space methods. The requirement that $u$ be in $H_0^1(D)$ implements the boundary condition $u|_{\partial D} = 0$, as evidenced by Theorem 11.24.

Having admitted membership of $H_0^1(D)$ as a valid way of implementing Dirichlet boundary conditions, one may still be tempted to look for solutions in $H_0^1(D) \cap H^2(D)$. It can be shown that this works (in the sense that a unique weak solution in the sense of Definition 11.33 belonging to this space exists) if $D$ is assumed to be bounded with $C^2$-boundary (see Remark 11.38). Without additional assumptions on $D$, however, such solutions need not exist.

[Portrait: Henri Poincaré, 1854-1912]

Before attacking the problem (11.13) using Hilbert space methods, we pause to emphasise that in some special cases this problem is simple enough to admit a direct elementary solution. For instance, for $d = 1$ and $D = (a,b)$ we may integrate the equation twice and determine the two integration constants by substituting the boundary conditions, which take the form $u(a) = u(b) = 0$. After some computations one arrives at

$$ u(x) = \int_a^b k(x,y)f(y)\,dy, \quad x \in (a,b), $$

where

$$ k(x,y) := \frac{1}{b-a}\begin{cases} (b-y)(x-a), & x \le y, \\ (b-x)(y-a), & y \le x, \end{cases} $$

is the so-called Green function for the Poisson problem on $(a,b)$ with Dirichlet boundary conditions. The reader may check (see Problem 11.25) that the function $u$ thus defined belongs to $H_0^1(a,b)$ and is indeed a weak solution of (11.13).
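The Green function representation lends itself to a quick numerical check. The sketch below assumes the kernel $k(x,y) = (b-y)(x-a)/(b-a)$ for $x \le y$ and $(b-x)(y-a)/(b-a)$ for $y \le x$, and approximates the integral by a midpoint rule; the function names are hypothetical. For $f \equiv 1$ on $(0,1)$ the exact solution of $-u'' = f$, $u(0) = u(1) = 0$, is $u(x) = x(1-x)/2$.

```python
def green_kernel(x, y, a, b):
    """Green function of -u'' = f on (a, b) with u(a) = u(b) = 0."""
    if x <= y:
        return (b - y) * (x - a) / (b - a)
    return (b - x) * (y - a) / (b - a)


def solve_by_green(f, x, a=0.0, b=1.0, n=2000):
    """Midpoint-rule approximation of u(x) = int_a^b k(x, y) f(y) dy."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        y = a + (i + 0.5) * h
        total += green_kernel(x, y, a, b) * f(y)
    return h * total


one = lambda y: 1.0
for x in (0.0, 0.3, 0.5, 1.0):
    # Compare with the exact solution x(1 - x)/2 for the constant right-hand side.
    print(x, solve_by_green(one, x), x * (1 - x) / 2)
```

The boundary values come out as zero because the kernel itself vanishes for $x = a$ and $x = b$.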

It is difficult, however, to extend this elementary method to higher dimensions. In contrast, the Hilbert space method adopted here works in arbitrary dimensions. Our main tool is the following inequality which is of interest in its own right.


Theorem 11.34 (Poincaré inequality). Let $D$ be bounded and let $1 \le p < \infty$. There exists a constant $C = C_{p,D} \ge 0$ such that the following estimate holds:

$$ \|f\|_{L^p(D)} \le C\|\nabla f\|_{L^p(D;\mathbb{K}^d)}, \quad f \in W_0^{1,p}(D). $$

In particular, $\|f\|_{W_0^{1,p}(D)} := \|\nabla f\|_{L^p(D;\mathbb{K}^d)}$ defines an equivalent norm on $W_0^{1,p}(D)$.


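Proof First assume that $f \in C_{\rm c}^\infty(D)$. By extending $f$ identically zero outside $D$ we may view $f$ as a function in $C_{\rm c}^\infty(\mathbb{R}^d)$. Choose $r > 0$ so large that $D \subseteq R := [-r,r]^d$. For $1 \le p < \infty$ and $x_1 \in [-r,r]$ we have, using Hölder's inequality and the fact that $f(-r,x_2,\dots,x_d) = 0$,

$$ |f(x_1,x_2,\dots,x_d)|^p = \Big|\int_{-r}^{x_1} \frac{\partial f}{\partial x_1}(t,x_2,\dots,x_d)\,dt\Big|^p \le (x_1+r)^{p/q}\int_{-r}^{x_1} \Big|\frac{\partial f}{\partial x_1}(t,x_2,\dots,x_d)\Big|^p dt, $$

where $\frac1p + \frac1q = 1$. Integrating both sides over $R$ and using Fubini's theorem, we obtain

$$ \begin{aligned} \|f\|_p^p = \int_R |f(x)|^p\,dx &\le \int_{-r}^r (x_1+r)^{p/q}\,dx_1 \int_R \Big|\frac{\partial f}{\partial x_1}(t,x_2,\dots,x_d)\Big|^p dt\,dx_2\cdots dx_d \\ &= \frac1p(2r)^p\Big\|\frac{\partial f}{\partial x_1}\Big\|_p^p, \end{aligned} $$

where we used that $\frac pq + 1 = p$. Since trivially $\|\frac{\partial f}{\partial x_1}\|_p \le \|\nabla f\|_p$, this gives the desired estimate, with constant $C_{p,D} := p^{-1/p}\cdot 2r$.
Since $C_{\rm c}^\infty(D)$ is dense in $W_0^{1,p}(D)$, the estimate for a general $f \in W_0^{1,p}(D)$ follows by approximation.
Finally, the equivalence of the norms follows from

$$ \|\nabla f\|_p \le \|f\|_p + \|\nabla f\|_p \le (C_{p,D}+1)\|\nabla f\|_p. $$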
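In dimension $d = 1$ with $D = (-r,r)$ the constant produced by the proof is $p^{-1/p}\cdot 2r$. The following sketch compares it with the actual ratio $\|f\|_p/\|f'\|_p$ for the sample function $f(x) = 1 - x^2$ on $(-1,1)$, which vanishes at both endpoints; the helper names are hypothetical and the integrals are approximated by a midpoint rule.

```python
def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def poincare_ratio(f, df, r, p):
    """Return ||f||_p / ||f'||_p on (-r, r) for f vanishing at both endpoints."""
    norm_f = integrate(lambda x: abs(f(x)) ** p, -r, r) ** (1 / p)
    norm_df = integrate(lambda x: abs(df(x)) ** p, -r, r) ** (1 / p)
    return norm_f / norm_df


r = 1.0
f = lambda x: 1 - x * x   # vanishes at x = -1 and x = 1
df = lambda x: -2 * x

for p in (1, 2, 4):
    bound = p ** (-1 / p) * 2 * r   # constant from the proof in dimension one
    print(p, poincare_ratio(f, df, r, p), bound)
```

The observed ratios stay well below the bound, which is expected: the argument only uses the vanishing of $f$ at the left endpoint and makes no attempt at the optimal constant.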


We now take $p = 2$ and recall the notation $H^1(D) = W^{1,2}(D)$ and $H_0^1(D) = W_0^{1,2}(D)$. Poincaré's inequality then states that

$$ \|f\|_{H_0^1(D)} := \|\nabla f\|_2, \quad f \in H_0^1(D), $$

defines an equivalent norm on $H_0^1(D)$. With respect to this norm, $H_0^1(D)$ is again a Hilbert space, this time with respect to the inner product

$$ (f_1|f_2)_{H_0^1(D)} := (\nabla f_1|\nabla f_2)_2. $$


Theorem 11.35 (Poisson problem, Dirichlet boundary conditions). If $D$ is bounded, then for every $f \in L^2(D)$ the Poisson problem (11.13) admits a unique weak solution $u \in H_0^1(D)$. Moreover, there exists a constant $C > 0$, independent of $f$, such that

$$ \|u\|_{H_0^1(D)} \le C\|f\|_2. $$


Proof By the Cauchy–Schwarz inequality and Poincaré's inequality, the linear mapping $L : g \mapsto \int_D g\overline{f}\,dx$ is bounded from $H_0^1(D)$ to $\mathbb{K}$:

$$ |L(g)| \le \|g\|_2\|f\|_2 \le C\|\nabla g\|_2\|f\|_2 = C\|g\|_{H_0^1(D)}\|f\|_2. $$

Therefore $L$ defines a bounded functional on $H_0^1(D)$. Hence, by the Riesz representation theorem, there exists a unique $u \in H_0^1(D)$ such that

$$ L(g) = (g|u)_{H_0^1(D)}, \quad g \in H_0^1(D), \tag{11.15} $$

and it satisfies $\|u\|_{H_0^1(D)} = \|L\| \le C\|f\|_2$. Writing out the identity (11.15), it takes the form

$$ \int_D \nabla g\cdot\overline{\nabla u}\,dx = \int_D g\overline{f}\,dx, \quad g \in H_0^1(D). \tag{11.16} $$

In particular it holds for all $g \in C_{\rm c}^\infty(D)$, since $C_{\rm c}^\infty(D)$ is contained in $H_0^1(D)$. Replacing $g$ by $\overline{g}$ and taking conjugates on both sides, we see that $u$ is a weak solution of (11.13).
If $v \in H_0^1(D)$ is another weak solution, then

$$ \int_D (\nabla u - \nabla v)\cdot\nabla\phi\,dx = 0, \quad \phi \in C_{\rm c}^\infty(D). $$

Since $C_{\rm c}^\infty(D)$ is dense in $H_0^1(D)$, it follows that

$$ \int_D (\nabla u - \nabla v)\cdot\nabla g\,dx = 0, \quad g \in H_0^1(D), $$

and applying this to $\overline{g}$ gives $(u - v|g)_{H_0^1(D)} = 0$ for all $g \in H_0^1(D)$. This implies $u - v = 0$ in $H_0^1(D)$.
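To see the weak solution of Theorem 11.35 in a concrete case, one can discretise the one-dimensional problem $-u'' = f$ on $(0,1)$ with $u(0) = u(1) = 0$. The sketch below is not the Hilbert space argument of the proof; it swaps in the standard three-point finite difference stencil (which coincides with piecewise linear finite elements with a lumped load vector) and solves the tridiagonal system by the Thomas algorithm. The function name `solve_dirichlet` is hypothetical.

```python
import math


def solve_dirichlet(f, n):
    """Approximate the solution of -u'' = f on (0, 1), u(0) = u(1) = 0.

    Uses n interior grid points; the linear system has 2 on the diagonal and
    -1 on both off-diagonals, with right-hand side h^2 * f(x_i).
    """
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    diag = [2.0] * n
    rhs = [h * h * f(x) for x in xs]
    # Forward elimination (all off-diagonal entries equal -1).
    for i in range(1, n):
        m = -1.0 / diag[i - 1]
        diag[i] -= m * (-1.0)
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return xs, u


# For f(x) = pi^2 sin(pi x) the exact solution is u(x) = sin(pi x).
xs, u = solve_dirichlet(lambda x: math.pi ** 2 * math.sin(math.pi * x), 99)
err = max(abs(ui - math.sin(math.pi * x)) for x, ui in zip(xs, u))
print(err)
```

The maximal error decreases quadratically in the mesh width, in line with the standard consistency estimate for this stencil.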


We prove next that the weak solution actually belongs to $H_0^1(D) \cap H^2_{\rm loc}(D)$, where $H^2_{\rm loc}(D)$ is the space of all $f \in L^1_{\rm loc}(D)$ with the property that $f|_U \in H^2(U)$ for all open sets $U \Subset D$. Defining the space $L^2_{\rm loc}(D)$ similarly, this will follow from the following lemma.


Lemma 11.36. If $f \in H^1(D)$ admits a weak Laplacian in $L^2_{\rm loc}(D)$, then $f \in H^2_{\rm loc}(D)$.


Proof Let $U, U'$ be bounded open sets such that $U \Subset U' \Subset D$, and let $\psi \in C_{\rm c}^\infty(U')$ satisfy $\psi \equiv 1$ on $U$. It is routine to check that if we view $\psi f$ as an element of $L^2(\mathbb{R}^d)$, it admits a weak Laplacian belonging to $L^2(\mathbb{R}^d)$ given by the Leibniz formula

$$ \Delta(\psi f) = (\Delta\psi)f + \psi h + 2\sum_{j=1}^d (\partial_j\psi)g_j, $$

where $g_j := \partial_j f \in L^2(D)$ and $h := \Delta f \in L^2_{\rm loc}(D)$ are the weak directional derivatives and the weak Laplacian of $f$ on $D$, respectively; we view all terms as functions defined on all of $\mathbb{R}^d$ by zero extension.
Theorem 11.29 then implies that $\psi f \in H^2(\mathbb{R}^d)$. Since $(\psi f)|_U = f|_U$, it follows that $f|_U$ belongs to $H^2(U)$. In particular for all $\phi \in C_{\rm c}^\infty(U)$ and all multi-indices $|\alpha| \le 2$ we have

$$ \int_D f(\partial^\alpha\phi)\,dx = (-1)^{|\alpha|}\int_D \big(\partial^\alpha(\psi f)\big)\phi\,dx. $$

If $U_1$ and $U_2$ are bounded open sets whose closures are contained in $D$, the weak derivatives up to order 2 coincide almost everywhere on $U_1 \cap U_2$. Gluing them together gives the result.

Theorem 11.37. Let $D$ be bounded. The weak solution $u$ of the Poisson problem $-\Delta u = f$ with $f \in L^2(D)$, subject to Dirichlet boundary conditions, belongs to $H_0^1(D) \cap H^2_{\rm loc}(D)$.


Proof The very definition of a weak solution implies that $u$ admits a weak Laplacian belonging to $L^2(D)$, given by $\Delta u = -f$. The result now follows from the lemma.


Remark 11.38. If $D$ has a $C^2$-boundary, it can be shown that the weak solution belongs to $H^2(D)$. The proof of this fact lies beyond the scope of this work.


The next theorem shows that a function in $H_0^1(D)$ solves the Poisson problem with Dirichlet boundary conditions if and only if it minimises a certain energy functional.


Theorem 11.39 (Variational characterisation). Let $D$ be bounded and let $f \in L^2(D)$. For a function $u_0 \in H_0^1(D)$ the following assertions are equivalent:


(1) $u_0$ is the weak solution of the Poisson problem $-\Delta u = f$ on $D$ subject to Dirichlet boundary conditions;
(2) $u_0$ minimises the energy functional $E : H_0^1(D) \to \mathbb{R}$ defined by

$$ E(u) := \frac12\int_D |\nabla u|^2\,dx - \mathrm{Re}\int_D u\overline{f}\,dx. $$

Proof We use the notation

$$ a(u,v) := \int_D \nabla u\cdot\overline{\nabla v}\,dx, \qquad L(u) := \int_D u\overline{f}\,dx. $$

With this notation, for all $t \in \mathbb{R}$ and $u_0, u \in H_0^1(D)$ we have

$$ E(u_0 + tu) = E(u_0) + t\,\mathrm{Re}\big(a(u,u_0) - Lu\big) + \frac12 t^2 a(u,u). \tag{11.17} $$

(1)⇒(2): Suppose that $u_0$ is a weak solution, that is, $u_0 \in H_0^1(D)$ and $a(\phi,u_0) = L(\phi)$ for all $\phi \in C_{\rm c}^\infty(D)$. By density, this identity extends to arbitrary $\phi \in H_0^1(D)$. Applying the identity with $\phi = u$ and taking $t = 1$ in (11.17), for all nonzero $u \in H_0^1(D)$ we obtain

$$ E(u_0 + u) = E(u_0) + \frac12 a(u,u) \ge E(u_0), \tag{11.18} $$

and, by Poincaré's inequality, the inequality is in fact strict. It follows that $u_0$ is a minimiser of $E$ in $H_0^1(D)$.

(2)⇒(1): Suppose conversely that $u_0$ minimises $E$ in $H_0^1(D)$. The identity (11.17) implies that for all $u \in H_0^1(D)$ the function $t \mapsto E(u_0 + tu)$ is differentiable in $t$ and

$$ 0 = \frac{d}{dt}\Big|_{t=0} E(u_0 + tu) = \mathrm{Re}\big(a(u,u_0) - Lu\big). $$

Over the real scalar field this implies that $a(u,u_0) - Lu = 0$. Over the complex scalar field we apply the preceding identity with $u$ replaced by $iu$ to find that also $\mathrm{Im}(a(u,u_0) - Lu) = 0$, and again it follows that $a(u,u_0) - Lu = 0$. In both cases we conclude that $u_0$ is a weak solution.
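The expansion of the energy around a weak solution can be observed numerically. The sketch below works over the real scalars on $D = (0,1)$ with $f \equiv 1$, for which $u_0(x) = x(1-x)/2$ solves $-u'' = f$ with Dirichlet boundary conditions, and perturbs in the direction $u(x) = \sin(\pi x)$. It checks that $E(u_0 + tu) - E(u_0)$ agrees with $\tfrac12 t^2 a(u,u)$, so the linear term in $t$ vanishes. All function names are hypothetical and the integrals are approximated by a midpoint rule.

```python
import math


def integrate(func, a=0.0, b=1.0, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


f = lambda x: 1.0
# u0 solves -u'' = 1 with u(0) = u(1) = 0; du0 is its derivative.
u0, du0 = (lambda x: x * (1 - x) / 2), (lambda x: 0.5 - x)
# Perturbation direction in H_0^1(0, 1) and its derivative.
u, du = (lambda x: math.sin(math.pi * x)), (lambda x: math.pi * math.cos(math.pi * x))


def energy(v, dv):
    """E(v) = 1/2 * int |v'|^2 dx - int v f dx (real scalars)."""
    return 0.5 * integrate(lambda x: dv(x) ** 2) - integrate(lambda x: v(x) * f(x))


def perturbed_energy(t):
    """Energy of u0 + t * u."""
    return energy(lambda x: u0(x) + t * u(x), lambda x: du0(x) + t * du(x))


a_uu = integrate(lambda x: du(x) ** 2)   # a(u, u), equal to pi^2 / 2 here
for t in (-1.0, 0.5, 2.0):
    print(t, perturbed_energy(t) - energy(u0, du0), 0.5 * t * t * a_uu)
```

Every perturbation increases the energy by a strictly positive amount, which is the strictness of the inequality used for uniqueness of the minimiser.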


The existence and uniqueness of a weak solution implies that for each $f \in L^2(D)$ the energy functional

$$ u \mapsto \frac12\int_D |\nabla u|^2\,dx - \mathrm{Re}\int_D u\overline{f}\,dx $$

has a unique minimiser. In the above proof, uniqueness was reflected by the strictness of the inequality in (11.18).
Theorem 11.39 is a special case of a more general result on the existence and uniqueness of minimisers for suitable nonlinear functionals defined on Hilbert spaces; see Problem 11.31.


11.2.b Neumann Boundary Conditions 


Having dealt with Dirichlet boundary conditions, we shall now take a look at the Poisson problem with Neumann boundary conditions:

$$ \begin{cases} -\Delta u = f & \text{on } D, \\ \dfrac{\partial u}{\partial\nu} = 0 & \text{on } \partial D, \end{cases} \tag{11.19} $$

where $\frac{\partial u}{\partial\nu} := \nabla u\cdot\nu$ is the partial derivative in the direction of the outward normal $\nu$ along $\partial D$ and $f \in L^2(D)$ is a given function. The treatment of this example in higher dimensions requires some familiarity with standard techniques from partial differential equations, as the notion of an outward normal is meaningful only under some regularity assumptions on the boundary $\partial D$. For the present treatment $C^1$-regularity of the boundary suffices.

To motivate the notion of weak solutions we need Green's theorem: If $D$ is bounded with $C^1$-boundary, then for all $u \in C^2(\overline{D})$ and $v \in C^1(\overline{D})$ we have

$$ \int_D \nabla u\cdot\nabla v\,dx = -\int_D v\Delta u\,dx + \int_{\partial D} v\frac{\partial u}{\partial\nu}\,dS, $$

where $dS$ is the normalised surface measure on $\partial D$. We forget the boundary condition for a moment and ask ourselves which information is conveyed by the integrated equation

$$ \int_D \nabla u\cdot\nabla\phi\,dx = \int_D f\phi\,dx \tag{11.20} $$

if it is to hold for all $\phi \in C^1(\overline{D})$ (and not just for all $\phi \in C_{\rm c}^\infty(D)$, since that would ignore what happens at the boundary). By Green's theorem,

$$ \int_D \phi\Delta u\,dx - \int_{\partial D} \phi\frac{\partial u}{\partial\nu}\,dS = -\int_D f\phi\,dx. \tag{11.21} $$

Since we wish to solve $-\Delta u = f$, we substitute this relation into (11.21) and find that

$$ \int_{\partial D} \phi\frac{\partial u}{\partial\nu}\,dS = 0. $$

This can only hold for all $\phi \in C^1(\overline{D})$ if

$$ \frac{\partial u}{\partial\nu} = 0 \quad \text{on } \partial D, $$

that is, if Neumann boundary conditions hold.


By considering $\phi \equiv 1$ we see that the identity (11.20) can only hold for all $\phi \in C^1(\overline{D})$ if $f$ satisfies the compatibility condition

$$ \int_D f\,dx = 0. $$

More generally the integral of $f$ against every locally constant function should vanish. Likewise, solutions of (11.20) cannot be unique: if $u$ is a solution (in whatever weak sense), then also $u + C$ is a solution, for any locally constant function $C$. In order to simplify matters we henceforth assume that $D$ is connected, so that the only locally constant functions are the constant functions. Under this assumption we get rid of integration constants by imposing the constraint

$$ \int_D u\,dx = 0, $$

that is, the average of $u$ over $D$ should vanish. Accordingly we define

$$ H^1_{\rm av}(D) := \Big\{u \in H^1(D) : \int_D u\,dx = 0\Big\}. $$

This discussion leads to the following weak formulation of problem (11.19).


Definition 11.40 (Weak solutions). Let $D$ be bounded and connected. A function $u \in H^1_{\rm av}(D)$ is called a weak solution of the Poisson problem (11.19) if

$$ \int_D \nabla u\cdot\nabla\phi\,dx = \int_D f\phi\,dx, \quad \phi \in C^1(\overline{D}). $$


Our treatment of the Poisson problem with Dirichlet boundary conditions crucially depended on the Poincaré inequality for $H_0^1(D)$. The treatment of Neumann boundary conditions proceeds analogously, the role of $H_0^1(D)$ being taken over by $H^1_{\rm av}(D)$. We will prove a version of the Poincaré inequality for this space in Theorem 11.42. Its proof depends on the following compactness result.


Theorem 11.41 (Rellich–Kondrachov compactness theorem). If $D$ is bounded and $1 \le p < \infty$, then:


(1) the inclusion mapping from $W_0^{1,p}(D)$ into $L^p(D)$ is compact;
(2) if $D$ has $C^1$-boundary, the inclusion mapping from $W^{1,p}(D)$ into $L^p(D)$ is compact.


Proof We first prove (1) and deduce (2) from it by means of the extension theorem.

(1): We must show that the unit ball of $W_0^{1,p}(D)$ is relatively compact in $L^p(D)$. By extending the elements of $W_0^{1,p}(D)$ identically zero outside $D$ we may view this unit ball as a bounded subset, which we denote by $B$, of $L^p(\mathbb{R}^d)$, and it suffices to prove that this set is relatively compact in $L^p(\mathbb{R}^d)$. For this purpose we use the Fréchet–Kolmogorov theorem, or rather, its corollary for bounded domains (Corollary 2.36). According to this theorem we must check that

$$ \lim_{|h|\to 0}\sup_{f\in B} \|\tau_h f - f\|_p = 0. \tag{11.22} $$

Here, $\tau_h f$ is the translate of $f$ over $h$, that is, $\tau_h f(x) = f(x+h)$. To prove this, first let $f \in B \cap C_{\rm c}^\infty(D)$ and extend $f$ identically zero to all of $\mathbb{R}^d$. For $r > 0$ let $D_r := \{x \in \mathbb{R}^d : d(x,D) < r\}$. If $|h| < \frac12 r$, then by Hölder's inequality, Fubini's theorem, and a change of variables,

$$ \begin{aligned} \int_{\mathbb{R}^d} |\tau_h f - f|^p\,dx &= \int_{D_{r/2}} |f(x+h) - f(x)|^p\,dx \\ &\le \int_{D_{r/2}} \Big(\int_0^1 \Big|\frac{d}{dt}f(x+th)\Big|\,dt\Big)^p dx \\ &\le \int_{D_{r/2}} \int_0^1 |h\cdot\nabla f(x+th)|^p\,dt\,dx \\ &\le |h|^p\int_{D_{r/2}} \int_0^1 |\nabla f(x+th)|^p\,dt\,dx \\ &= |h|^p\int_0^1 \int_{D_{r/2}} |\nabla f(x+th)|^p\,dx\,dt \\ &\le |h|^p\int_0^1 \int_{D_r} |\nabla f(y)|^p\,dy\,dt \\ &= |h|^p\int_D |\nabla f(y)|^p\,dy = |h|^p\|\nabla f\|_{L^p(D;\mathbb{K}^d)}^p \le |h|^p, \end{aligned} $$

keeping in mind that $\|f\|_{W^{1,p}(D)} \le 1$ since $f \in B$. The above estimate holds for any $f \in B \cap C_{\rm c}^\infty(D)$. Since $C_{\rm c}^\infty(D)$ is dense in $W_0^{1,p}(D)$ this estimate extends to arbitrary $f \in B$. This proves that if $|h| < \frac12 r$, then

$$ \sup_{f\in B} \|\tau_h f - f\|_p \le |h| $$

and (11.22) follows.

(2): Let $D \Subset D'$, where $D' \subseteq \mathbb{R}^d$ is some larger bounded domain. Let $\psi \in C_{\rm c}^\infty(\mathbb{R}^d)$ be compactly supported in $D'$ and satisfy $\psi \equiv 1$ on $D$. As in Theorem 11.28 let $E_D : W^{1,p}(D) \to W^{1,p}(\mathbb{R}^d)$ be a bounded extension operator, let $M_\psi : W^{1,p}(\mathbb{R}^d) \to W_0^{1,p}(D')$ be the multiplier given by $f \mapsto \psi f$ (we use Proposition 11.22 to see that this operator maps into $W_0^{1,p}(D')$ as claimed), let $i_{D'} : W_0^{1,p}(D') \to L^p(D')$ be the inclusion mapping (which is compact by part (1)), and let $R_{D',D} : L^p(D') \to L^p(D)$ be the restriction operator $f \mapsto f|_D$. Then the inclusion mapping $j_D : W^{1,p}(D) \to L^p(D)$ factors as

$$ j_D = R_{D',D}\circ i_{D'}\circ M_\psi\circ E_D $$

and is therefore compact.
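The translation estimate $\|\tau_h f - f\|_p \le |h|\,\|\nabla f\|_p$ that drives part (1) of the proof can be observed in dimension one. The sketch below takes $f(x) = 1 - x^2$ on $(-1,1)$, extended by zero, as a sample element of $W_0^{1,p}(-1,1)$ (not normalised to the unit ball, which does not affect the inequality) and compares both sides by midpoint quadrature; the function names are hypothetical.

```python
def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def f(x):
    """f(x) = 1 - x^2 on (-1, 1), extended by zero outside."""
    return 1 - x * x if abs(x) < 1 else 0.0


def df(x):
    """Weak derivative of the zero extension of f."""
    return -2 * x if abs(x) < 1 else 0.0


def translation_gap(h, p=2):
    """Return (||tau_h f - f||_p, |h| * ||f'||_p) with tau_h f(x) = f(x + h)."""
    lhs = integrate(lambda x: abs(f(x + h) - f(x)) ** p, -3.0, 3.0) ** (1 / p)
    rhs = abs(h) * integrate(lambda x: abs(df(x)) ** p, -1.0, 1.0) ** (1 / p)
    return lhs, rhs


for h in (0.5, 0.1, 0.01):
    print(h, *translation_gap(h))
```

As $h \to 0$ the two sides approach each other, which reflects that the difference quotient converges to the derivative in $L^p$.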


As an application of the Rellich–Kondrachov theorem we have the following variant of Poincaré's inequality where we define

$$ W^{1,p}_{\rm av}(D) := \Big\{u \in W^{1,p}(D) : \int_D u\,dx = 0\Big\}. $$


Theorem 11.42 (Poincaré–Wirtinger inequality). Let $D$ be bounded and connected with $C^1$-boundary. Let $1 \le p < \infty$. Then there exists a constant $C = C_{p,D}$ such that for all $f \in W^{1,p}_{\rm av}(D)$ the following estimate holds:

$$ \|f\|_p \le C\|\nabla f\|_p. $$

In particular, $\|f\|_{W^{1,p}_{\rm av}(D)} := \|\nabla f\|_p$ defines an equivalent norm on $W^{1,p}_{\rm av}(D)$.


Proof We argue by contradiction. If the theorem were false we could find a sequence $(f_n)_{n\ge1}$ in $W^{1,p}_{\rm av}(D)$ such that $\|f_n\|_p \ge n\|\nabla f_n\|_p$ for $n = 1,2,\dots$ By scaling we may assume that $\|f_n\|_p = 1$, so that $\|\nabla f_n\|_p \le \frac1n$.

Since $(f_n)_{n\ge1}$ is bounded in $W^{1,p}(D)$ we may use the Rellich–Kondrachov theorem to extract a subsequence $(f_{n_k})_{k\ge1}$ of $(f_n)_{n\ge1}$ that converges, with respect to the norm of $L^p(D)$, to some $f \in L^p(D)$. Also $\|\nabla f_{n_k}\|_p \le \frac{1}{n_k} \to 0$ as $k \to \infty$, and therefore the closedness of $\nabla$ as an operator from $L^p(D)$ to $L^p(D;\mathbb{K}^d)$ implies that $f \in \mathsf{D}(\nabla) = W^{1,p}(D)$ and $\nabla f = 0$. In view of

$$ \int_D f\,dx = \lim_{k\to\infty}\int_D f_{n_k}\,dx = 0 \tag{11.23} $$

we even have $f \in W^{1,p}_{\rm av}(D)$. But $\nabla f = 0$ implies, via Proposition 11.16, that $f$ is constant almost everywhere. In view of (11.23) this is only possible if $f = 0$ almost everywhere. We thus arrive at the contradiction $0 = \|f\|_p = \lim_{k\to\infty} \|f_{n_k}\|_p = 1$.
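The role of the mean-zero constraint is easy to see numerically. In the sketch below, on $D = (0,1)$ with $p = 2$, the function $\cos(\pi x)$ has mean zero and the ratio $\|f\|_2/\|f'\|_2$ equals $1/\pi$; adding a constant $c$ leaves the gradient unchanged but inflates the norm, so no Poincaré-type bound can hold on all of $W^{1,p}(D)$. The function names are hypothetical.

```python
import math


def integrate(func, a=0.0, b=1.0, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def ratio(c, p=2):
    """Return ||f||_p / ||f'||_p on D = (0, 1) for f(x) = cos(pi x) + c."""
    f = lambda x: math.cos(math.pi * x) + c
    df = lambda x: -math.pi * math.sin(math.pi * x)
    norm_f = integrate(lambda x: abs(f(x)) ** p) ** (1 / p)
    norm_df = integrate(lambda x: abs(df(x)) ** p) ** (1 / p)
    return norm_f / norm_df


# c = 0 is the mean-zero case; larger c moves f away from W^{1,p}_av(D).
for c in (0.0, 1.0, 10.0, 100.0):
    print(c, ratio(c))
```

The unbounded growth of the ratio in $c$ is exactly the obstruction removed by passing to $W^{1,p}_{\rm av}(D)$.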


We are now in a position to solve the Poisson problem with Neumann boundary conditions.

Theorem 11.43 (Poisson problem, Neumann boundary conditions). Let $D$ be bounded and connected with $C^1$-boundary. For every $f \in L^2(D)$ satisfying

$$ \int_D f\,dx = 0 $$

the Poisson problem (11.19) admits a unique weak solution $u \in H^1_{\rm av}(D)$. Moreover, there exists a constant $C > 0$, independent of $f$, such that

$$ \|u\|_{H^1_{\rm av}(D)} \le C\|f\|_2. $$


Proof The argument follows the proof of Theorem 11.35 with minor adjustments.
By Theorem 11.42,

$$ \|g\|_{H^1_{\rm av}(D)} := \|\nabla g\|_2 $$

defines an equivalent norm on $H^1_{\rm av}(D)$. This norm arises from the inner product

$$ (g|h)_{H^1_{\rm av}(D)} := (\nabla g|\nabla h)_2. $$

In the rest of the proof we shall consider $H^1_{\rm av}(D)$ with this norm.
By the Cauchy–Schwarz inequality and the Poincaré–Wirtinger inequality, the linear mapping $L : g \mapsto \int_D g\overline{f}\,dx$ is bounded from $H^1_{\rm av}(D)$ to $\mathbb{K}$:

$$ |L(g)| \le \|g\|_2\|f\|_2 \le C\|\nabla g\|_2\|f\|_2 = C\|g\|_{H^1_{\rm av}(D)}\|f\|_2. $$

Therefore $L$ defines a bounded functional on $H^1_{\rm av}(D)$. Hence, by the Riesz representation theorem there exists a unique $u \in H^1_{\rm av}(D)$ such that

$$ L(g) = (g|u)_{H^1_{\rm av}(D)}, \quad g \in H^1_{\rm av}(D), $$

and it satisfies $\|u\|_{H^1_{\rm av}(D)} = \|L\| \le C\|f\|_2$. Writing out this identity, it takes the form

$$ \int_D \nabla g\cdot\overline{\nabla u}\,dx = \int_D g\overline{f}\,dx, \quad g \in H^1_{\rm av}(D). \tag{11.24} $$

For an arbitrary $g \in H^1(D)$ we may write $g = m + (g - m)$ with $m := \frac{1}{|D|}\int_D g\,dx$. Then $g - m \in H^1_{\rm av}(D)$. Since (11.24) also holds with $g$ replaced by the constant function $m$ (both sides vanish, the right-hand side because $\int_D f\,dx = 0$), it follows that (11.24) holds for all $g \in H^1(D)$. In particular it holds for all $g \in C^1(\overline{D})$, since such functions belong to $H^1(D)$. Taking conjugates on both sides we see that $u$ is a weak solution of (11.19).

If $v$ is another weak solution, then

$$ \int_D (\nabla u - \nabla v)\cdot\nabla\phi\,dx = 0, \quad \phi \in C^1(\overline{D}). $$

Since $C^1(\overline{D})$ is dense in $H^1(D)$ by Theorem 11.27, $(u - v|g)_{H^1_{\rm av}(D)} = 0$ for all $g \in H^1_{\rm av}(D)$. This implies that $u - v = 0$ in $H^1_{\rm av}(D)$.

The analogue of Theorem 11.37 holds, with the same proof:


Theorem 11.44. Let $D$ be bounded and connected with $C^1$-boundary. The weak solution of the Poisson problem $-\Delta u = f$ with $f \in L^2(D)$, subject to Neumann boundary conditions, belongs to $H^1(D) \cap H^2_{\rm loc}(D)$.


If $D$ has $C^2$-boundary, the weak solution $u$ can again be shown to belong to $H^2(D)$.
A variational characterisation of weak solutions in the spirit of Theorem 11.39 can again be given:

Theorem 11.45 (Variational characterisation of the solution). Let D be bounded and 
connected with C!-boundary, and let f € L?(D). For a function ug € H},(D) the follow- 
ing assertions are equivalent: 


(1) uo is the weak solution of the Poisson problem —Au = f on D subject to Neumann 
boundary conditions; 
(2) uo minimises the energy functional E : H'(D) — R defined by 


E(u) := 5 [\uPax—Re f ufar. 


Proof The proof is very similar to that in the case of Dirichlet boundary conditions.
We use the notation

\[ a(u,v) := \int_D \nabla u\cdot\overline{\nabla v}\,dx, \qquad L(u) := \int_D u\overline{f}\,dx. \]

With this notation, for all $t \in \mathbb{R}$ and $u_0, u \in H^1(D)$ we have

\[ E(u_0+tu) = E(u_0) + t\operatorname{Re}(a(u,u_0) - Lu) + \tfrac12 t^2 a(u,u). \tag{11.25} \]

(1)$\Rightarrow$(2): Suppose that $u_0$ is a weak solution, that is, $u_0 \in H^1_{\rm av}(D)$ and $a(\phi,u_0) =
L(\phi)$ for all $\phi \in C^\infty(\overline{D})$. By density, this identity extends to arbitrary $\phi \in H^1(D)$.
Applying the identity with $\phi = u$ and $t = 1$ in (11.25), for all nonzero
$u \in H^1(D)$ we obtain

\[ E(u_0+u) = E(u_0) + \tfrac12 a(u,u) \ge E(u_0), \]

and, by the Poincaré–Wirtinger inequality, the inequality is strict if $u \in H^1_{\rm av}(D)$. It follows
that $u_0$ is a minimiser of $E$ in $H^1(D)$.

(2)$\Rightarrow$(1): Suppose conversely that $u_0 \in H^1_{\rm av}(D)$ minimises $E$ in $H^1(D)$. The identity
(11.25) implies that for all $u \in H^1_{\rm av}(D)$ the function $t \mapsto E(u_0+tu)$ is differentiable
and

\[ 0 = \frac{d}{dt}\Big|_{t=0} E(u_0+tu) = \operatorname{Re}(a(u,u_0) - Lu). \]

Over the real scalar field this implies that $a(u,u_0) - Lu = 0$. Over the complex scalar
field we apply the preceding identity with $u$ replaced by $iu$ to find that also $\operatorname{Im}(a(u,u_0) -
Lu) = 0$, and again it follows that $a(u,u_0) - Lu = 0$. In both cases we conclude that $u_0$
is a weak solution.
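
The expansion (11.25) used in the proof can be checked by a direct computation. The following worked derivation is a sketch that uses only the sesquilinearity of $a$, the symmetry $a(u_0,u) = \overline{a(u,u_0)}$ of the Dirichlet form, the linearity of $L$, and the fact that $t$ is real:

```latex
\begin{align*}
E(u_0+tu)
 &= \tfrac12\,a(u_0+tu,\,u_0+tu) - \operatorname{Re} L(u_0+tu) \\
 &= \tfrac12\,a(u_0,u_0) + \tfrac{t}{2}\bigl(a(u,u_0) + a(u_0,u)\bigr)
    + \tfrac{t^2}{2}\,a(u,u) - \operatorname{Re} L(u_0) - t\operatorname{Re} L(u) \\
 &= E(u_0) + t\operatorname{Re}\bigl(a(u,u_0) - L(u)\bigr) + \tfrac{t^2}{2}\,a(u,u),
\end{align*}
```

where the last step uses $a(u,u_0) + a(u_0,u) = 2\operatorname{Re} a(u,u_0)$.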


11.2.c The Elliptic Problem $\lambda u - \Delta u = f$


The results of the preceding two sections admit straightforward modifications for the
elliptic problem

\[ \lambda u - \Delta u = f \]

for $\operatorname{Re}\lambda > 0$ and $f \in L^2(D)$, subject to Dirichlet or Neumann boundary conditions.
As always we assume that $D$ is open and bounded in $\mathbb{R}^d$, and in the case of Neumann
boundary conditions we furthermore assume that $D$ is connected and has $C^1$-boundary.

To treat the case of Dirichlet boundary conditions we define a weak solution to be a
function $u \in H^1_0(D)$ such that

\[ \int_D \lambda u\overline{\phi} + \nabla u\cdot\overline{\nabla\phi}\,dx = \int_D f\overline{\phi}\,dx, \quad \phi \in C_c^\infty(D). \]

Repeating the steps of the proofs of Theorems 11.35 and 11.39 one obtains:


Theorem 11.46 (Elliptic problem, Dirichlet boundary conditions). If $D$ is bounded, then
for all $\operatorname{Re}\lambda > 0$ and $f \in L^2(D)$ the elliptic problem $\lambda u - \Delta u = f$ subject to Dirichlet
boundary conditions admits a unique weak solution. For $\lambda > 0$ this weak solution is the
unique minimiser of the energy functional $E : H^1_0(D) \to \mathbb{R}$ defined by

\[ E(u) := \frac12\int_D |\nabla u|^2 + \lambda|u|^2\,dx - \operatorname{Re}\int_D u\overline{f}\,dx. \]


In the case of Neumann boundary conditions, the presence of the additional term $\lambda u$
has the effect of simplifying the heuristic reasoning motivating the definition of a weak
solution in Section 11.2.b, in that the averaging condition is no longer needed. Repeating
the argument, it is found that a weak solution should now be defined to be an element
$u \in H^1(D)$ such that

\[ \int_D \lambda u\overline{\phi} + \nabla u\cdot\overline{\nabla\phi}\,dx = \int_D f\overline{\phi}\,dx, \quad \phi \in C^\infty(\overline{D}). \]

Repeating the steps of the proofs of Theorems 11.43 and 11.45 one obtains:


Theorem 11.47 (Elliptic problem, Neumann boundary conditions). If $D$ is bounded
and connected with $C^1$-boundary, then for all $\operatorname{Re}\lambda > 0$ and $f \in L^2(D)$ the elliptic problem
$\lambda u - \Delta u = f$ subject to Neumann boundary conditions admits a unique weak solution.
For $\lambda > 0$ this weak solution is the unique minimiser of the energy functional
$E : H^1(D) \to \mathbb{R}$ defined by

\[ E(u) := \frac12\int_D |\nabla u|^2 + \lambda|u|^2\,dx - \operatorname{Re}\int_D u\overline{f}\,dx. \]


We limit the present treatment of the elliptic problem to the above two theorems. In
the next two chapters we will develop powerful techniques that allow us to give precise
$L^2$-estimates for the solutions $u$ in terms of the data $f$ and to extend these estimates to
$L^p$ for $1 < p < \infty$.


11.3 The Lax—Milgram Theorem 


The considerations of the previous section depended crucially on the use of the Riesz
representation theorem as an abstract tool to prove the existence and uniqueness of
solutions. This technique can be extended to more general classes of boundary value
problems by using a more flexible version of the Riesz representation theorem, the so-called
Lax–Milgram theorem.


11.3.a The Theorem


In what follows, $V$ is a Hilbert space. The reason for using the letter $V$ is that in applications,
typical choices are $V = H^1_0(D)$ and $V = H^1(D)$, where $D$ is an open subset of
$\mathbb{R}^d$. In such settings the letter $H$ will be reserved for the space $L^2(D)$. In order to prevent
possible confusion, the inner product and norm of $V$ will be denoted by $(\cdot|\cdot)_V$ and $\|\cdot\|_V$,
respectively.


Definition 11.48 (Forms). A form on $V$ is a sesquilinear mapping $a : V \times V \to \mathbb{K}$. A
form $a$ on $V$ is called bounded if there exists a constant $C > 0$ such that

\[ |a(u,v)| \le C\|u\|_V\|v\|_V, \quad u,v \in V. \]

In the language of forms, Proposition 9.15 asserts that if $a : V \times V \to \mathbb{K}$ is a bounded
form, then there exists a unique bounded operator $A$ on $V$ such that

\[ a(v,v') = (Av|v')_V \quad \text{for all } v,v' \in V. \]

Moreover, $\|A\|_{\mathscr{L}(V)} \le C$, where $C$ is the boundedness constant of $a$.

Definition 11.49 (Accretive and coercive forms). A form $a$ on $V$ is called accretive if

\[ \operatorname{Re} a(v,v) \ge 0, \quad v \in V, \]

and coercive if there exists a constant $\alpha > 0$ such that

\[ \operatorname{Re} a(v,v) \ge \alpha\|v\|_V^2, \quad v \in V. \]

For bounded coercive forms we have the following version of Proposition 9.15.

Theorem 11.50 (Lax–Milgram). If $a$ is a bounded coercive form on $V$, then:

(1) the bounded operator $A$ associated with $a$ is boundedly invertible and
$\|A^{-1}\|_{\mathscr{L}(V)} \le \alpha^{-1}$, where $\alpha$ is the coercivity constant of $a$;
(2) for every bounded functional $L : V \to \mathbb{K}$ there exists a unique $v' \in V$ such that

\[ L(v) = a(v,v'), \quad v \in V. \]

Moreover, $\|v'\|_V \le \|A^{-1}\|_{\mathscr{L}(V)}\|L\|$.


Proof We proceed in two steps.

Step 1 – Let $A$ be the bounded operator provided by Proposition 9.15. The estimate

\[ \alpha\|v\|_V^2 \le \operatorname{Re} a(v,v) \le |a(v,v)| = |(Av|v)_V| \le \|Av\|_V\|v\|_V \]

implies that $\alpha\|v\|_V \le \|Av\|_V$ for all $v \in V$. From this we infer that $A$ is one-to-one and
has closed range $\mathsf{R}(A)$ in $V$ (see Proposition 1.21). The operator $A$ is also surjective, for
otherwise there exists a nonzero element $v^\perp \in (\mathsf{R}(A))^\perp$ and we arrive at the contradiction

\[ 0 < \alpha\|v^\perp\|_V^2 \le \operatorname{Re} a(v^\perp,v^\perp) = \operatorname{Re}(Av^\perp|v^\perp)_V = 0. \]

By the open mapping theorem, $A$ has bounded inverse. The estimate $\alpha\|v\|_V \le \|Av\|_V$
now implies that $\|A^{-1}\|_{\mathscr{L}(V)} \le \alpha^{-1}$.

Step 2 – Given a bounded functional $L : V \to \mathbb{K}$, by the Riesz representation theorem
there exists a unique $v_0 \in V$ such that $L(v) = (v|v_0)_V$ for all $v \in V$. Moreover, it satisfies
$\|v_0\|_V = \|L\|$. Since $A$, and hence $A^*$, is boundedly invertible, there exists a unique $v' \in V$
satisfying $A^*v' = v_0$. Then

\[ L(v) = (v|v_0)_V = (v|A^*v')_V = (Av|v')_V = a(v,v'), \quad v \in V, \]

and

\[ \|v'\|_V \le \|(A^*)^{-1}\|_{\mathscr{L}(V)}\|v_0\|_V = \|A^{-1}\|_{\mathscr{L}(V)}\|L\|. \]

This proves the existence part as well as the estimate for the norm of $v'$. To prove
uniqueness, suppose that also $L(v) = a(v,v'')$ for some $v'' \in V$ and all $v \in V$. Then
$a(v,v'-v'') = 0$ for all $v \in V$. Taking $v = v'-v''$, coercivity gives
$0 \le \alpha\|v'-v''\|_V^2 \le \operatorname{Re} a(v'-v'',v'-v'') = 0$. This implies $v' = v''$.
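
The two steps of the proof can be traced in a finite-dimensional toy case. The following is a minimal sketch, assuming $V = \mathbb{R}^2$ with the Euclidean inner product; the matrix `A`, the vector `v0`, and all function names are illustrative choices that do not come from the text. The form $a(u,v) = (Au|v)$ is nonsymmetric but coercive with $\alpha = 1$, and the element $v'$ is obtained by solving $A^* v' = v_0$ as in Step 2.

```python
import math

# Illustrative data (not from the text): a(u, v) = (A u | v) on V = R^2,
# with a nonsymmetric matrix A, so that a is a nonsymmetric form.
A = [[1.0, 1.0],
     [-1.0, 1.0]]


def inner(u, v):
    """Euclidean inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]


def norm(v):
    return math.sqrt(inner(v, v))


def form(u, v):
    """The bilinear form a(u, v) = (A u | v)."""
    Au = [A[0][0] * u[0] + A[0][1] * u[1],
          A[1][0] * u[0] + A[1][1] * u[1]]
    return inner(Au, v)


def solve_adjoint(b):
    """Solve A^T x = b by Cramer's rule; A^T plays the role of A^* in Step 2."""
    t11, t12 = A[0][0], A[1][0]   # first row of A^T
    t21, t22 = A[0][1], A[1][1]   # second row of A^T
    det = t11 * t22 - t12 * t21
    return [(b[0] * t22 - t12 * b[1]) / det,
            (t11 * b[1] - t21 * b[0]) / det]


# Coercivity: a(v, v) = v1^2 + v2^2 = |v|^2, so alpha = 1 works.
alpha = 1.0

# A bounded functional L(v) = (v | v0), represented by v0 as in the Riesz theorem.
v0 = [3.0, -2.0]
v_prime = solve_adjoint(v0)   # the element v' with L(v) = a(v, v') for all v

for v in ([1.0, 0.0], [0.0, 1.0], [2.0, -5.0]):
    print(form(v, v_prime), inner(v, v0))   # the two values should agree
print(norm(v_prime) <= norm(v0) / alpha)     # the bound ||v'|| <= ||L|| / alpha
```

In this example $\|L\| = \|v_0\|$, so the last line checks the estimate of part (2) combined with part (1) of the theorem.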


Part (2) of the theorem provides a generalisation of the Riesz representation theorem
with the inner product replaced by a bounded coercive form $a$. If $a$ is symmetric, that is,
$a(v,v') = \overline{a(v',v)}$ for all $v,v' \in V$ (some authors refer to this as $a$ being Hermitian), then
$a$ defines an inner product on $V$ generating an equivalent norm. In this situation
the Lax–Milgram theorem is an immediate consequence of the Riesz representation
theorem. The principal interest in the theorem lies in the nonsymmetric case.


11.3.b The Sturm–Liouville Problem


As a first application of the Lax–Milgram theorem, generalising the results on the Poisson
problem we shall consider the Sturm–Liouville problem with Dirichlet boundary
conditions on a nonempty bounded open subset $D$ of $\mathbb{R}^d$:

\[ \begin{cases} -\operatorname{div}(a\nabla u) + qu = f & \text{on } D, \\ u|_{\partial D} = 0, \end{cases} \tag{11.26} \]

where we make the following assumptions:

• the function $f$ belongs to $L^2(D)$;

• the matrix-valued function $a : D \to M_d(\mathbb{K})$ has bounded measurable coefficients and
is coercive in the sense that there is a constant $\alpha > 0$ such that for almost all $x \in D$
we have

\[ \operatorname{Re}\sum_{i,j=1}^d a_{ij}(x)\xi_i\overline{\xi_j} \ge \alpha|\xi|^2, \quad \xi \in \mathbb{K}^d; \]

• the function $q : D \to \mathbb{K}$ is bounded and measurable and satisfies $\operatorname{Re} q(x) \ge 0$ for almost
all $x \in D$.


A function $u \in H^1_0(D)$ is called a weak solution of (11.26) if for all $\phi \in C_c^\infty(D)$ we
have

\[ \int_D a\nabla u\cdot\overline{\nabla\phi}\,dx + \int_D qu\overline{\phi}\,dx = \int_D f\overline{\phi}\,dx. \]


Theorem 11.51 (Sturm–Liouville problem). Let $D$ be bounded. Under the above assumptions
on $a$, $q$, and $f$, (11.26) admits a unique weak solution $u$ in $H^1_0(D)$. Moreover,
there exists a constant $C > 0$ independent of $f$ such that

\[ \|u\|_{H^1_0(D)} \le C\|f\|_2. \]


Proof The proof is a straightforward adaptation of the proof of existence and uniqueness
for the Poisson problem with Dirichlet boundary conditions. This time we apply the
Lax–Milgram theorem to the form $a : H^1_0(D) \times H^1_0(D) \to \mathbb{K}$,

\[ a(u,v) := \int_D a\nabla u\cdot\overline{\nabla v}\,dx + \int_D qu\overline{v}\,dx, \]

which is bounded and coercive: boundedness follows from

\begin{align*}
|a(u,v)| &\le \|a\|_\infty\|\nabla u\|_2\|\nabla v\|_2 + \|q\|_\infty\|u\|_2\|v\|_2 \\
 &\le 2\max\{\|a\|_\infty,\|q\|_\infty\}\|u\|_{H^1_0(D)}\|v\|_{H^1_0(D)}
\end{align*}

and coercivity from

\begin{align*}
\operatorname{Re} a(v,v) &= \operatorname{Re}\int_D a\nabla v\cdot\overline{\nabla v}\,dx + \operatorname{Re}\int_D q|v|^2\,dx \\
 &\ge \int_D \operatorname{Re} a\nabla v\cdot\overline{\nabla v}\,dx \ge \alpha\|\nabla v\|_2^2 = \alpha|||v|||^2_{H^1_0(D)},
\end{align*}

where $|||v|||_{H^1_0(D)} = \|\nabla v\|_2$ is the equivalent norm on $H^1_0(D)$ considered before.


The case of Neumann boundary conditions can be handled similarly and is left to the
reader (see Problem 11.30).


Problems 


11.1 Show that for all $f \in L^p(0,1)$ with $1 \le p \le \infty$ the function

\[ I_f(x) := \int_0^x f(y)\,dy, \quad x \in (0,1), \]

belongs to $W^{1,p}(0,1)$ and its weak derivative is given by $I_f' = f$. Moreover, the
mapping $f \mapsto I_f$ from $L^p(0,1)$ to $W^{1,p}(0,1)$ is bounded.
11.2 Let $f \in W^{1,p}(0,1)$ with $1 \le p \le \infty$.

(a) Show that $f$ is equal almost everywhere to a continuous function $\tilde f \in C[0,1]$
such that for all $x \in [0,1]$ we have

\[ \tilde f(x) = \tilde f(0) + \int_0^x f'(y)\,dy. \]

(b) Show that the resulting mapping $f \mapsto \tilde f$ from $W^{1,p}(0,1)$ to $C[0,1]$ is bounded.
(c) Show that a function $f \in W^{1,p}(0,1)$ with $1 \le p < \infty$ belongs to $W_0^{1,p}(0,1)$ if and
only if its continuous version $\tilde f$ (see Problem 11.2) satisfies $\tilde f(0) = \tilde f(1) = 0$.
11.3 Give a direct proof that $C^\infty[0,1]$ is dense in $W^{1,p}(0,1)$ for all $1 \le p < \infty$.
11.4 Fix $1 < p < \infty$ and $f \in W^{1,p}(0,1)$, and let $\tilde f \in C[0,1]$ be its continuous version
(see Problem 11.2).


(a) Suppose that $\tilde f(0) = 0$. Show that $x \mapsto \tilde f(x)/x$ belongs to $L^p(0,1)$ with

\[ \Big\| x \mapsto \frac{\tilde f(x)}{x} \Big\|_p \le \frac{p}{p-1}\|f'\|_p. \]
Hint: Use Young’s inequality for (Ry, ) from Problem 2.25 with 


i x'/P f'(x), x € [0,1], 
r= {3 x € (1,2). 


(b) Suppose that $x \mapsto \tilde f(x)/x$ belongs to $L^p(0,1)$. Show that $\tilde f(0) = 0$.
Hint: Argue by contradiction.
(c) Define
1 


eas ae x € (0,1). 
Show that $f \in W^{1,1}(0,1)$ and $f(0) = 0$, but $x \mapsto f(x)/x$ does not belong to $L^1(0,1)$.


11.5 Determine whether the function f € L'((—1,1) x (—1,1)) given by 
fy) =a Gy) €(-1,1)) x (-1,)) 


has weak derivatives of order one. If ‘no’, provide a proof; if ‘yes’, compute the
weak derivatives $\partial_1 f$ and $\partial_2 f$.
11.6 Let $D$ be bounded and fix $1 \le p < \infty$. Let $r \in \mathbb{R}$ satisfy $r > 1 - \frac{d}{p}$.


(a) Show that the function $f(x) := |x|^r$ belongs to $W^{1,p}(D)$, and compute its
weak partial derivatives.

(b) Let {x, : n> 1} be a countable dense set in D. Show that the function g(x) := 
Ln>1 mn |x —x,|" belongs to W!?(D). 

(c) Show that if $d \ge 2$ and $1 - \frac{d}{p} < r < 0$, then $g$ is unbounded on every open
subset of $D$.


11.7 For $1 \le p < \infty$ we consider the weak derivative $\partial$ as a linear operator in $L^p(0,1)$
with domain $\mathsf{D}(\partial) := C_c^\infty(0,1)$.
(a) Show that $\partial$ is closable as an operator in $L^p(0,1)$.
(b) Show that the domain of the closure of $\partial$ equals $W_0^{1,p}(0,1)$.
(c) Show that this closure has a proper closed extension, given by the weak
derivative with domain $W^{1,p}(0,1)$.
(d) Why doesn’t this contradict the result of Proposition 10.30?


11.8 Let $f \in L^1_{\rm loc}(D)$.


(a) Show that if $f$ admits a weak gradient, then it admits weak derivatives $\partial_j f$
for all $j = 1,\dots,d$.

(b) Show that if $f$ admits a weak Laplacian, then it admits weak derivatives $\partial_j f$
for all $j = 1,\dots,d$.

(c) Show that if f € L?,,(D) and f admits a weak Laplacian in L?,.(D), then 
f € H,(D). 
Hint: First check that the weak derivatives 0; f belong to Hy,.(D) and use 
this to prove that yf has a weak Laplacian in bee (D) for every test function 
w € CS(D). Then use Theorem 11.29. 


11.9 Show that if $f \in W^{1,p}(D)$ with $1 \le p \le \infty$, then $\nabla f = 0$ almost everywhere on the
set $\{x \in D : f(x) = 0\}$.
Hint: In the real-valued case, $\nabla f = \nabla(f^+) - \nabla(f^-)$.

11.10 Is $H^1(D)$ a Banach lattice?
11.11 Show that if a real-valued function $f \in L^1_{\rm loc}(D)$ admits weak derivatives $\partial_j f$ and
$\rho : \mathbb{R} \to \mathbb{K}$ is a $C^1$-function with bounded derivative, then $\rho\circ f$ admits weak
derivatives given by

\[ \partial_j(\rho\circ f) = (\rho'\circ f)\,\partial_j f. \]

11.12 Let $1 \le p,q,r \le \infty$ satisfy $\frac1p + \frac1q = \frac1r$. Prove that if $f \in W^{k,p}(D)$ and $g \in W^{k,q}(D)$,
then $fg \in W^{k,r}(D)$, and for all multi-indices $\alpha$ with $|\alpha| \le k$ we have the Leibniz
formula

\[ \partial^\alpha(fg) = \sum_{0\le\beta\le\alpha}\binom{\alpha}{\beta}(\partial^\beta f)(\partial^{\alpha-\beta} g). \]
11.13 Does Theorem 11.41 extend to the case $p = \infty$?
11.14 Consider the inclusion mapping $f \mapsto \tilde f$ from $W^{1,p}(0,1)$ to $C[0,1]$ of Problem
11.2. In what follows we let $1 < p \le \infty$.


(a) Show that there is a constant $C > 0$ such that $\|\tilde f\|_\infty \le C\|f\|_{W^{1,p}(0,1)}$ for all
$f \in W^{1,p}(0,1)$.

(b) Show that the inclusion $W^{1,p}(0,1) \subseteq C[0,1]$ is compact.
Hint: Use the Arzelà–Ascoli theorem.

(c) Show that the inclusion $W^{1,1}(0,1) \subseteq C[0,1]$ fails to be compact.


Hint: Approximate 1 ) pointwise by a sequence of piecewise linear func- 


ml 

3 
tions which is Cauchy in W!'!(0, 1). 

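11.15 Show that the inclusion mapping $W^{1,p}(\mathbb{R}) \subseteq L^p(\mathbb{R})$ fails to be compact.

11.16 Let $f \in W^{1,\infty}(D)$.

(a) Suppose that $\eta \in L^1(\mathbb{R}^d)$ has support in the unit ball $B(0;1)$ of $\mathbb{R}^d$ and satisfies
$\int_{\mathbb{R}^d}\eta(x)\,dx = 1$. For $\varepsilon > 0$ denote $\eta_\varepsilon(x) := \varepsilon^{-d}\eta(\varepsilon^{-1}x)$. Show that the
convolution $f_\varepsilon := \eta_\varepsilon * f$ satisfies the pointwise bound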


IVF) <MMfllwimw), *€ Dey 
where $D_\varepsilon := \{x \in D : d(x,\partial D) > \varepsilon\}$.
(b) Show that if $D$ is convex, then


[fe(x) — fe) <lIfllwte~ye yl, ty € De. 


(c) Deduce that if $D$ is convex, then for every $f \in W^{1,\infty}(D)$ there exists a Lipschitz
continuous function $g : D \to \mathbb{K}$ such that $f = g$ almost everywhere,
with Lipschitz constant $L_g \le \|f\|_{W^{1,\infty}(D)}$.

(d) Show that the result of part (c) fails for the nonconvex open set in $\mathbb{R}^2$ obtained
by removing the nonnegative part of the $x$-axis from $B(0;1)$.
11.17 Show that if $f \in W_0^{1,p}(D)$ with $1 \le p < \infty$ and $D'$ is an open set containing $D$,
then the function $\tilde f : D' \to \mathbb{K}$ defined by

\[ \tilde f(x) := \begin{cases} f(x), & x \in D, \\ 0, & x \in D'\setminus D, \end{cases} \]

belongs to $W_0^{1,p}(D')$.
11.18 Show that there exists a constant $C > 0$ such that for all $f \in C_c^\infty(\mathbb{R}^d)$ we have

\[ \|\nabla f\|_2 \le C\|f\|_2^{1/2}\|\Delta f\|_2^{1/2}. \]

Show that this inequality extends to functions $f \in W^{2,2}(\mathbb{R}^d)$.
Hint: Start by showing that $\|\nabla f\|_2 \le C(\|f\|_2 + \|\Delta f\|_2)$ for some (possibly different)
constant $C > 0$. Then apply this inequality with $f(cx)$ in place of $f(x)$ and
optimise over $c > 0$.

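11.19 For a function $f$ on $\mathbb{R}^d$ let

\[ D_j^t f(x) := \frac1t\bigl(f(x+te_j) - f(x)\bigr), \quad 1 \le j \le d,\ t \in \mathbb{R}\setminus\{0\}, \]

where $e_j$ is the $j$th standard unit vector of $\mathbb{R}^d$.


(a) Prove that if $f \in W^{1,p}(\mathbb{R}^d)$ with $1 \le p \le \infty$, then

\[ \|D_j^t f\|_p \le \|\partial_j f\|_p, \quad 1 \le j \le d,\ t \in \mathbb{R}\setminus\{0\}, \]

where $\partial_j f$ denotes the $j$th partial derivative of $f$.
(b) Prove that if $f \in L^p(\mathbb{R}^d)$ with $1 < p \le \infty$ and there exists a constant $C > 0$
such that

\[ \|D_j^t f\|_p \le C, \quad 1 \le j \le d,\ t \in \mathbb{R}\setminus\{0\}, \]

then $f \in W^{1,p}(\mathbb{R}^d)$ and $\|\partial_j f\|_p \le C$ for all $1 \le j \le d$.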


11.20 Let $1 \le p < \infty$. The aim of this problem is to show that for all $f \in W^{1,p}(\mathbb{R})$ we
have $f' = \lim_{h\to0} D_h f$ in $L^p(\mathbb{R})$, where

\[ D_h f(x) := \frac{f(x+h) - f(x)}{h}, \quad x \in \mathbb{R},\ h \ne 0. \]

(a) Let $h \ne 0$. Show that

\[ T_h f(x) := \frac1h\int_x^{x+h} f(t)\,dt, \quad x \in \mathbb{R}, \]

defines a bounded operator on $L^p(\mathbb{R})$ of norm $\|T_h\| \le 1$.
Hint: Show that $T_h f = \frac1h\mathbf{1}_{(0,h)}(-\,\cdot\,) * f$, where $*$ denotes the convolution
product, and use Young’s inequality.

(b) Show that for all $f \in C_c^1(\mathbb{R})$ we have $f' = \lim_{h\to0} D_h f$ in $L^p(\mathbb{R})$.
(c) Deduce that for all $f \in W^{1,p}(\mathbb{R})$ we have $f' = \lim_{h\to0} D_h f$ in $L^p(\mathbb{R})$.

11.21 Show that for all $s > 0$ the norm given by (11.9) turns $H^s(\mathbb{R}^d)$ into a Hilbert
space.

11.22 Using Fourier analytic methods, prove the following special case of the Sobolev
embedding theorem: If $k > d/2$, then every $f \in H^k(\mathbb{R}^d)$ is equal almost everywhere
to a function belonging to $C_0(\mathbb{R}^d)$. Moreover, the inclusion mapping
$H^k(\mathbb{R}^d) \subseteq C_0(\mathbb{R}^d)$ is continuous.

11.23 The aim of this problem is to prove another special case of the Sobolev embedding
theorem. By completing the following steps, show that if $d < p < \infty$, then
every $f \in W^{1,p}(D)$ is equal almost everywhere to a continuous function on $D$.

(a) First let D = R“. Show that if f € Cl(R“) and the function 7 € Cl(—1,1) 
satisfies 7 (0) = 1, then 


oo d 
£0) = ff gayyy ML 93f 09) + Flea) 0) A510) ar, 


j=l 


where S is the surface measure, and hence 
1 
YOl<cf —alIVF@)+ Feo ae 
B(O:1) |x| 


for some constant C > 0 independent of /f. 
(b) Apply Hölder’s inequality to obtain the bound

\[ |f(0)| \le C'\|f\|_{W^{1,p}(\mathbb{R}^d)} \]

for some constant $C' > 0$ independent of $f$.
(c) By translation, conclude that

\[ \|f\|_\infty \le C'\|f\|_{W^{1,p}(\mathbb{R}^d)}. \]

Use a density argument to prove that if $f \in W^{1,p}(\mathbb{R}^d)$, then it is equal almost
everywhere to a bounded continuous function.
(d) For general domains use a localisation argument.
11.24 This problem is a continuation of the preceding one. 
(a) Show that if 1 < p,q < © satisfy ds 7 *) < 1, then every f € W!?(R“) 
belongs to L4(IR“) and 


If llacea) S Clif llwrcee) 


for some constant C > 0 independent of /f. 

Hint: Starting from the formulas of the preceding problem, use a translation
argument in combination with Young’s inequality (see the hint of Problem
11.20).
(b) By repeatedly applying the inequality of part (a), deduce an embedding result
for functions in $W^{k,p}(\mathbb{R}^d)$ into the space of bounded continuous functions.
11.25 Consider the Green function on the unit interval $[0,1]$ (see Section 11.2.a):

\[ k(x,y) = \begin{cases} (1-x)y, & 0 \le y \le x, \\ (1-y)x, & x \le y \le 1. \end{cases} \]

(a) Show that the associated integral operator

\[ Tf(x) := \int_0^1 k(x,y)f(y)\,dy \]

on $L^2(0,1)$ is compact and has eigenvalues $1/(n\pi)^2$, $n = 1,2,3,\dots$, with
corresponding eigenfunctions $x \mapsto \sin(n\pi x)$.


Let now $f \in L^2(0,1)$ be given and define $u \in L^2(0,1)$ by

\[ u(x) := Tf(x), \quad x \in (0,1). \]

(b) Show that $u \in H^1_0(0,1)$.
(c) Show that $u$ is a weak solution of the Poisson problem with Dirichlet boundary
conditions

\[ \begin{cases} -u'' = f & \text{on } (0,1), \\ u(0) = u(1) = 0. \end{cases} \]

11.26 In this problem we consider the Poisson problem with Dirichlet boundary conditions
on the unit disc $D = B(0;1)$ in $\mathbb{R}^2$:

\[ \begin{cases} -\Delta u = f & \text{on } D, \\ u|_{\partial D} = 0. \end{cases} \tag{11.27} \]

If $f \in L^2(D)$, we know from Theorem 11.35 that (11.27) has a unique weak solution
$u \in H^1_0(D)$. The aim of this problem is to show that, even for functions
$f \in C_c(D)$, (11.27) may not admit a classical solution $u \in C^2(D) \cap C(\overline{D})$.

Define $v : B(0;\frac12)\setminus\{0\} \to \mathbb{R}$ by

\[ v(x,y) := (x^2-y^2)\log\bigl|\log\bigl((x^2+y^2)^{1/2}\bigr)\bigr|, \quad (x,y) \in B(0;\tfrac12)\setminus\{0\}. \]

(a) Show that $v \in C^2(B(0;\frac12)\setminus\{0\})$ and compute $v_x$, $v_y$, $v_{xx}$, and $v_{yy}$.
(b) Show that

\[ \lim_{(x,y)\to0} v(x,y) = \lim_{(x,y)\to0} v_x(x,y) = \lim_{(x,y)\to0} v_y(x,y) = 0. \]

Conclude that $v$ can be extended to a function in $C^1(B(0;\frac12))$.
Hint: For $\varepsilon > 0$ one has $|\log s| \le s^{-\varepsilon}$ for $s$ small and $|\log s| \le s^{\varepsilon}$ for $s$ large.

(c) Show that $\Delta v : B(0;\frac12)\setminus\{0\} \to \mathbb{R}$ has a continuous extension to $B(0;\frac12)$,
which we denote by $g : B(0;\frac12) \to \mathbb{R}$. Moreover, show that

\[ \int_{B(0;\frac12)} v\,\Delta\phi\,dx = \int_{B(0;\frac12)} g\phi\,dx, \quad \phi \in C_c^\infty(B(0;\tfrac12)). \]

Hint: Use Green’s theorem on $B(0;\frac12)\setminus B(0;\varepsilon)$ with $\varepsilon > 0$ and let $\varepsilon \downarrow 0$.

(d) Let $\eta \in C_c^\infty(D)$ be compactly supported in $B(0;\frac12)$ with $\eta = 1$ on $B(0;\frac14)$.
Show that $u := -\eta v$ belongs to $H^1_0(D)$ and is the weak solution of (11.27)
with $f := \eta g + 2\nabla\eta\cdot\nabla v + (\Delta\eta)v$.

(e) Show that $\lim_{x\to0} v_{xx}(x,x) = \infty$ and deduce from this that $u \notin C^2(D)$. Conclude
that (11.27) does not admit a classical solution $u$.

Hint: Use Theorem 11.35.

11.27 Prove Theorems 11.46 and 11.47. 
11.28 The aim of this problem is to solve the Poisson problem with inhomogeneous
boundary conditions

\[ \begin{cases} -\Delta u = f & \text{on } D, \\ u|_{\partial D} = g, \end{cases} \tag{11.28} \]

where $D \subseteq \mathbb{R}^d$ is bounded and $f \in L^2(D)$ and $g \in C(\partial D)$ are given functions.
We assume the function $g$ admits an $H^1(D)$-extension, by which we mean that
there exists a function $\tilde g \in H^1(D) \cap C(\overline{D})$ such that $\tilde g|_{\partial D} = g$. Under these assumptions,
a function $u \in H^1(D)$ is called a weak solution of the Poisson problem
with Dirichlet boundary conditions (11.28) if

\[ \int_D \nabla u\cdot\overline{\nabla\phi}\,dx = \int_D f\overline{\phi}\,dx, \quad \phi \in C_c^\infty(D), \]

and $u - \tilde g \in H^1_0(D)$. The condition $u - \tilde g \in H^1_0(D)$ is the rigorous way to express
the boundary condition $u|_{\partial D} = g$.


(a) The condition $u - \tilde g \in H^1_0(D)$ explicitly refers to the extension $\tilde g$. By using
Theorem 11.24, show that the fulfilment of this condition does not depend
on the particular choice of the extension.

(b) Prove that for every $f \in L^2(D)$ the Poisson problem (11.28) has a unique
weak solution $u \in H^1(D)$.

Hint: For $u \in H^1_0(D)$, show that

\[ L(u) := \int_D u\overline{f}\,dx - \int_D \nabla u\cdot\overline{\nabla\tilde g}\,dx \]

defines a bounded functional on $H^1_0(D)$ and apply the Riesz representation
theorem.
11.29 Let $D$ be bounded and let $g \in C(\partial D)$ be arbitrary. Let $u$ be a classical solution
of the Dirichlet problem, that is, the problem (11.28) with $f = 0$. Prove that the
following assertions are equivalent:


(1) $u$ has finite energy, that is, $\int_D |\nabla u|^2\,dx < \infty$;
(2) $u$ is a weak solution;
(3) $g$ has an $H^1(D)$-extension.
Hint: For the proof of (3)=>(2), let D’ € D. The function g’ := u|py has an H!(D’)- 
extension, given by u|p:. Hence by the result of the preceding problem, the prob- 
lem 

Av=0 onD’, 

vlap = g, 
has a unique weak solution 7 € H'(D'). Prove that 7 = u almost everywhere on 
D’ and that the restriction of #—u to D’ belongs to H}(D’). 
11.30 Discuss the Sturm—Liouville problem for Neumann boundary conditions. 


11.31 Let $H$ be a Hilbert space and let $h \in H$ be a given element. Show that the nonlinear
functional $E : H \to \mathbb{R}$ defined by

\[ E(u) := \tfrac12\|u\|^2 - \operatorname{Re}(u|h) \]

has a unique minimiser by completing the following steps.


(a) Show that $E$ is continuous and bounded from below, that is,

\[ m := \inf_{u\in H} E(u) > -\infty. \]


(b) Using the parallelogram identity, show that for all u,v € H we have 
1 
gllu—vll? < (E(w) — @) + (E(v) — a). 


(c) Deduce that if $(u_n)_{n\ge1}$ is a sequence in $H$ such that $\lim_{n\to\infty} E(u_n) = m$, then
this sequence is Cauchy.
(d) Prove that $u := \lim_{n\to\infty} u_n$ is the unique element of $H$ minimising $E$.


11.32 Let $V$ be a Hilbert space and consider a bounded coercive form $a : V \times V \to \mathbb{K}$.
Let $L : V \to \mathbb{K}$ be a bounded functional. By the Lax–Milgram theorem there is a
unique $u_V \in V$ satisfying $a(v,u_V) = L(v)$ for all $v \in V$.

Suppose now that $W$ is a closed subspace of $V$. By the Lax–Milgram theorem,
applied to the restriction of $a$ to $W \times W$, there is a unique $u_W \in W$ satisfying
$a(w,u_W) = L(w)$ for all $w \in W$.


(a) Show that $a(w,u_V - u_W) = 0$ for all $w \in W$.
(b) Show that

\[ \|u_V - u_W\| \le C\alpha^{-1}\inf_{w\in W}\|u_V - w\|, \tag{11.29} \]

with $C > 0$ and $\alpha > 0$ the boundedness and coercivity constants of $a$.
(c) Show that if $a$ is symmetric, that is, $a(v_1,v_2) = \overline{a(v_2,v_1)}$ for all $v_1,v_2 \in V$,
then

\[ \|u_V - u_W\| \le \sqrt{C\alpha^{-1}}\inf_{w\in W}\|u_V - w\|. \tag{11.30} \]

The quasi-optimality estimates in (11.29) and (11.30) are known as Céa’s lemma.


11.33 In this problem we outline an application to the so-called finite element method
for the Poisson problem (11.13) on the unit interval $(0,1)$ with datum $f \in L^2(0,1)$:

\[ \begin{cases} -\Delta u = f & \text{on } (0,1), \\ u(0) = u(1) = 0. \end{cases} \tag{11.31} \]

In what follows we endow $V := H^1_0(0,1)$ with the norm $|||v|||_{H^1_0(0,1)} := \|v'\|_2$. By
the Poincaré inequality, this norm is equivalent to the Sobolev norm $\|v\|_2 + \|v'\|_2$.

Consider a partition $\pi = \{x_0,\dots,x_N\}$ of the interval $[0,1]$, that is, we assume
that $0 = x_0 < x_1 < \dots < x_{N-1} < x_N = 1$. Let $V_\pi$ denote the closed subspace consisting
of all $v \in V$ that are linear on each of the intervals $[x_{n-1},x_n]$.


(a) Show that there exist unique elements $u \in H^1_0(0,1)$ and $u_\pi \in V_\pi$ such that

\[ \begin{aligned} a(v,u) &= \int_0^1 v(x)\overline{f(x)}\,dx, \quad v \in V, \\ a(v,u_\pi) &= \int_0^1 v(x)\overline{f(x)}\,dx, \quad v \in V_\pi. \end{aligned} \tag{11.32} \]

Using the results of Problem 11.32, prove the quasi-optimality estimate

\[ |||u - u_\pi|||_{H^1_0(0,1)} \le \inf_{v\in V_\pi}|||u - v|||_{H^1_0(0,1)}. \]

(b) Show that for all $v \in H^1_0(0,1) \cap H^2(0,1)$ we have $\pi v \in H^1_0(0,1)$ and

\[ |||v - \pi v|||_{H^1_0(0,1)} \le h_\pi\|v''\|_{L^2(0,1)}, \]

where $\pi v \in V_\pi$ is obtained by piecewise linear interpolation of the values
$v(x_n)$, $0 \le n \le N$, and $h_\pi := \max_{1\le n\le N}|x_n - x_{n-1}|$ is the mesh of $\pi$.
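Hint: Fix $x \in [0,1]\setminus\pi$ and choose $1 \le n \le N$ such that $x_{n-1} < x < x_n$. Then

\[ (\pi v)'(x) = \frac{v(x_n) - v(x_{n-1})}{x_n - x_{n-1}} \]

and, since $v \in H^2(0,1)$,

\[ v(x_n) = v(x_{n-1}) + (x_n - x_{n-1})v'(x_{n-1}) + \int_{x_{n-1}}^{x_n}(x_n - y)v''(y)\,dy. \]

Rewriting the latter as

\[ \frac{v(x_n) - v(x_{n-1})}{x_n - x_{n-1}} = v'(x_{n-1}) + \int_{x_{n-1}}^{x_n}\frac{x_n - y}{x_n - x_{n-1}}v''(y)\,dy \]

and using that $0 \le \frac{x_n - y}{x_n - x_{n-1}} \le 1$ for $y \in [x_{n-1},x_n]$ and
$v'(x) = v'(x_{n-1}) + \int_{x_{n-1}}^{x} v''(y)\,dy$, show that

\[ |(\pi v)'(x) - v'(x)| \le \int_{x_{n-1}}^{x_n}|v''(y)|\,dy. \]
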

(c) Let the assumptions of Theorem 11.35 be satisfied with $d = 1$ and $D = (0,1)$,
and let $u \in H^1_0(0,1) \cap H^2(0,1)$ be the weak solution of the Poisson problem
(11.31) (see Theorem 11.37). Prove that

\[ |||u - u_\pi|||_{H^1_0(0,1)} \le h_\pi\|f\|_{L^2(0,1)}. \]

Since the norm $|||v|||_{H^1_0(0,1)}$ is equivalent to the Sobolev norm $\|v\|_2 + \|v'\|_2$, the
result of part (c) shows that $u_\pi$ and its weak derivative $u_\pi'$ provide good approximations
of $u$ and its weak derivative $u'$ in the $L^2(0,1)$-norm if $h_\pi$ is small.

The approximate solution $u_\pi$ can be constructed explicitly as follows. Every
$u \in V_\pi$ can be written uniquely as a finite linear combination $u = \sum_{n=1}^{N-1} c_n\psi_n$,
where $\psi_n \in V_\pi$ is the piecewise linear function given by the requirements

\[ \psi_n(x_m) = \begin{cases} 1, & n = m, \\ 0, & n \ne m, \end{cases} \]

since these functions form a basis for $V_\pi$. By definition, $u_\pi$ is the unique element
of $V_\pi$ solving (11.32) which, for our boundary value problem, takes the form

\[ \int_0^1 v'\overline{u_\pi'}\,dx = \int_0^1 v\overline{f}\,dx, \quad v \in V_\pi. \]

Since the functions $\psi_n$ form a basis for $V_\pi$, this holds if and only if

\[ \int_0^1 \psi_n'\overline{u_\pi'}\,dx = \int_0^1 \psi_n\overline{f}\,dx, \quad n = 1,\dots,N-1. \]

Writing $u_\pi = \sum_{m=1}^{N-1} c_m\psi_m$ and taking complex conjugates (the functions $\psi_n$ are
real-valued), our task is reduced to determining the coefficients
$c_1,\dots,c_{N-1}$ from the system of $N-1$ linear equations

\[ \sum_{m=1}^{N-1} c_m\int_0^1 \psi_n'\psi_m'\,dx = \int_0^1 \psi_n f\,dx, \quad n = 1,\dots,N-1. \]

The functions $\psi_n'$ take nonzero constant values on the intervals $(x_{n-1},x_n)$ and
$(x_n,x_{n+1})$ and vanish on the remaining sub-intervals. It follows from this that
$\int_0^1 \psi_m'\psi_n'\,dx = 0$ unless $m - n \in \{-1,0,1\}$. Therefore the computation of the coefficients
$c_m$ reduces to a matrix problem of the form $Sc = d$, where the so-called
stiffness matrix $S$ is the $(N-1)\times(N-1)$ matrix whose coefficients

\[ S_{nm} = \int_0^1 \psi_n'\psi_m'\,dx \]

vanish off the diagonal and the two neighbouring off-diagonals, and

\[ d_n = \int_0^1 \psi_n f\,dx. \]

This problem is easy to solve with numerical methods from Linear Algebra.
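
The construction described above can be tried out numerically. The following is a minimal sketch, assuming a constant datum $f$ so that the load vector $d$ can be computed exactly; the function names `hat_function_system`, `solve_tridiagonal` and `fem_solve` and the chosen partition are illustrative and do not come from the text. It assembles the tridiagonal stiffness matrix for the hat functions $\psi_n$ and solves $Sc = d$ by elimination.

```python
def hat_function_system(nodes, f_value=1.0):
    """Assemble S c = d for the hat-function basis on the partition `nodes`.

    Assumes a constant datum f = f_value, so that d_n = f_value * (h_n + h_{n+1}) / 2.
    Returns the sub-, main and super-diagonal of S and the load vector d.
    """
    h = [nodes[k + 1] - nodes[k] for k in range(len(nodes) - 1)]
    n_int = len(nodes) - 2                       # number of interior nodes, N - 1
    diag = [1.0 / h[k] + 1.0 / h[k + 1] for k in range(n_int)]
    off = [-1.0 / h[k + 1] for k in range(n_int - 1)]
    load = [f_value * 0.5 * (h[k] + h[k + 1]) for k in range(n_int)]
    return off, diag, off[:], load


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm (Gaussian elimination without pivoting) for a tridiagonal system."""
    n = len(diag)
    b, d = diag[:], rhs[:]
    for i in range(1, n):
        w = lower[i - 1] / b[i - 1]
        b[i] -= w * upper[i - 1]
        d[i] -= w * d[i - 1]
    x = [0.0] * n
    x[n - 1] = d[n - 1] / b[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (d[i] - upper[i] * x[i + 1]) / b[i]
    return x


def fem_solve(nodes, f_value=1.0):
    """Coefficients c_1, ..., c_{N-1}; c_n is the value of u_pi at the node x_n."""
    lower, diag, upper, load = hat_function_system(nodes, f_value)
    return solve_tridiagonal(lower, diag, upper, load)


nodes = [0.0, 0.1, 0.3, 0.6, 1.0]                # a non-uniform partition of [0, 1]
coeffs = fem_solve(nodes)
exact = [0.5 * x * (1.0 - x) for x in nodes[1:-1]]   # u(x) = x(1 - x)/2 solves -u'' = 1
max_error = max(abs(c - e) for c, e in zip(coeffs, exact))
print(max_error)
```

For the datum $f = 1$ the exact solution is $u(x) = x(1-x)/2$, and in this example the computed coefficients agree with its nodal values up to rounding. Elimination without pivoting is used because $S$ is symmetric and diagonally dominant; for a general datum $f$ the integrals $d_n$ would have to be approximated by a quadrature rule.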


12 


Forms 


This chapter develops elements of the theory of sesquilinear forms and uses it to define 
and study certain bounded and unbounded operators, including second order differen- 
tial operators such as the Laplace operator subject to Dirichlet and Neumann boundary 
conditions. 


12.1 Forms 


In the previous chapter we proved existence and uniqueness of weak solutions of the
Poisson problem $-\Delta u = f$ on a nonempty bounded open subset $D \subseteq \mathbb{R}^d$ for functions
$f \in H = L^2(D)$ by exploiting the properties of the sesquilinear mapping $a : V \times V \to \mathbb{K}$,

\[ a(u,v) := \int_D \nabla u\cdot\overline{\nabla v}\,dx, \tag{12.1} \]

where $V = H^1(D)$ or a suitable closed subspace thereof. If the matrix-valued function
$a : D \to M_d(\mathbb{K})$ is coercive, the sesquilinear mapping

\[ a(u,v) := \int_D a\nabla u\cdot\overline{\nabla v}\,dx \tag{12.2} \]

played the same role in solving the Sturm–Liouville problem. In each of these cases,
the key ingredient was the Poincaré inequality, which can be phrased in terms of $a$ as

\[ \operatorname{Re} a(v,v) \ge \alpha\|v\|_V^2, \quad v \in V, \]

where $\alpha > 0$ is a constant.

In order to study these matters from an abstract point of view it will be useful to
interpret a form $a$ defined on a subspace $V$ of a Hilbert space $H$ as one in $H$ with domain
$\mathsf{D}(a) = V$, in the same way as the notion of a bounded operator has been generalised to
that of a linear (possibly unbounded) operator $A$ defined on a domain $\mathsf{D}(A)$.

Definition 12.1 (Forms, accretivity and coercivity). A form in a Hilbert space $H$ is a
pair $(a,\mathsf{D}(a))$, where $\mathsf{D}(a)$ is a subspace of $H$, the domain of $a$, and $a : \mathsf{D}(a)\times\mathsf{D}(a) \to \mathbb{K}$
is a sesquilinear mapping. A form $(a,\mathsf{D}(a))$ is called accretive if

\[ \operatorname{Re} a(x,x) \ge 0, \quad x \in \mathsf{D}(a), \]

and coercive if there exists a constant $\alpha > 0$ such that

\[ \operatorname{Re} a(x,x) \ge \alpha\|x\|^2, \quad x \in \mathsf{D}(a). \]


In what follows, $H$ always denotes a Hilbert space. Definitions 11.48 and 11.49 are
recovered in the special case $\mathsf{D}(a) = H$. When no confusion is likely to arise, we omit
$\mathsf{D}(a)$ from the notation and denote the form by $a$.


Example 12.2. The forms in $H = L^2(D)$ defined by (12.1) and (12.2) are accretive and
continuous on the domain $\mathsf{D}(a) = H^1(D)$, and coercive on the domains $\mathsf{D}(a) = H^1_0(D)$
and $\mathsf{D}(a) = H^1_{\rm av}(D)$.


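If $a$ is an accretive form in $H$, then

\[ (x|y)_a := \operatorname{Re} a(x,y) + (x|y), \quad x,y \in \mathsf{D}(a), \tag{12.3} \]

defines an inner product on $\mathsf{D}(a)$; here, $(x|y)$ is the inner product of $x$ and $y$ in $H$ and

\[ \operatorname{Re} a := \tfrac12(a + a^*) \]

is the symmetric part of $a$, with $a^*$ given by $a^*(x,y) := \overline{a(y,x)}$. The inner product (12.3) induces
a norm on $\mathsf{D}(a)$ given by

\[ \|x\|_a = (x|x)_a^{1/2}. \]

Warning:

\[ \operatorname{Re} a(x,y) = \tfrac12\bigl(a(x,y) + \overline{a(y,x)}\bigr) \]

should not be confused with

\[ \operatorname{Re}(a(x,y)) = \tfrac12\bigl(a(x,y) + \overline{a(x,y)}\bigr). \]

The former defines a sesquilinear form, the real part of $a$, but the latter generally does
not. It is true, however, that $\operatorname{Re} a(x,x) = \operatorname{Re}(a(x,x))$ for all $x \in \mathsf{D}(a)$.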
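
The distinction made in the warning can be seen in a small numerical example. The following is a minimal sketch, assuming $H = \mathbb{C}^2$ and a form $a(x,y) = (Ax|y)$ given by an arbitrarily chosen complex matrix `A`; the matrix, the vectors and the function names are illustrative and do not come from the text.

```python
# Illustrative example: the form a(x, y) = sum_i (A x)_i * conj(y_i) on H = C^2,
# for an arbitrarily chosen complex matrix A (not taken from the text).
A = [[1 + 0j, 1j],
     [0j, 2 + 0j]]


def a(x, y):
    """Sesquilinear form: linear in x, conjugate-linear in y."""
    Ax = [A[0][0] * x[0] + A[0][1] * x[1],
          A[1][0] * x[0] + A[1][1] * x[1]]
    return Ax[0] * y[0].conjugate() + Ax[1] * y[1].conjugate()


def real_part_form(x, y):
    """The form Re a: (x, y) -> (a(x, y) + conj(a(y, x))) / 2, again sesquilinear."""
    return 0.5 * (a(x, y) + a(y, x).conjugate())


def real_part_of_value(x, y):
    """Re(a(x, y)), the real part of the complex number a(x, y)."""
    return a(x, y).real


x = [1 + 0j, 0j]
y = [0j, 1 + 0j]
z = [1 + 1j, 2 - 1j]

print(real_part_form(x, y), real_part_of_value(x, y))   # these differ off the diagonal
print(real_part_form(z, z), real_part_of_value(z, z))   # these agree on the diagonal
```

Replacing the second argument $x$ by $ix$ in `real_part_of_value(y, x)` does not multiply the value by $\overline{i} = -i$, which illustrates why $(x,y) \mapsto \operatorname{Re}(a(x,y))$ fails to be sesquilinear over the complex scalars.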

From Definition 11.48 we recall that a form $a$ on $V$ is said to be bounded if there
exists a constant $C > 0$ such that

\[ |a(u,v)| \le C\|u\|_V\|v\|_V, \quad u,v \in V. \]

This definition can be extended to forms in $H$ as follows.

Definition 12.3 (Continuous forms). An accretive form $a$ in $H$ is called continuous if
there exists a constant $C > 0$ such that

\[ |a(x,y)| \le C\|x\|_a\|y\|_a, \quad x,y \in \mathsf{D}(a). \]

A sufficient condition for continuity will be given in Proposition 13.42.


12.1.a Closed Forms 


The following definition is motivated by the simple fact, observed in Proposition 10.3,
that a linear operator $A$ is closed if and only if its domain $\mathsf{D}(A)$ is a Banach space with
respect to the graph norm.


Definition 12.4 (Closed forms). An accretive form $a$ in $H$ is called closed if $\mathsf{D}(a)$ is a
Hilbert space with respect to the norm $\|\cdot\|_a$.


The following two propositions express some robustness properties of closed forms.
Among other things, the first proposition clarifies the relation between Definition 12.1,
where coercivity of forms in $H$ was defined through the condition

\[ \operatorname{Re} a(x,x) \ge \alpha\|x\|^2, \quad x \in \mathsf{D}(a), \]

and Definition 11.49, where coercivity of a form on a Hilbert space $V$ was defined
through the condition

\[ \operatorname{Re} a(x,x) \ge \alpha\|x\|_V^2, \quad x \in V. \]

In the former case, one could view $a$ as a form on $V = \mathsf{D}(a)$ and ask why norms are taken
in $H$ rather than in $V$. As it turns out, except for the numerical value of the constant, this
leads to the same definition.


Proposition 12.5. A closed form $a$ in $H$ is accretive (respectively coercive, continuous)
if and only if $a$, as a form on the Hilbert space $V = (\mathsf{D}(a),\|\cdot\|_a)$, is accretive
(respectively coercive, bounded).


Proof Only the assertion concerning coercivity needs proof. We must prove that there
exists a constant $\alpha > 0$ such that

\[ \operatorname{Re} a(x,x) \ge \alpha\|x\|^2, \quad x \in \mathsf{D}(a), \tag{12.4} \]

if and only if there exists a constant $\beta > 0$ such that

\[ \operatorname{Re} a(x,x) \ge \beta\|x\|_a^2, \quad x \in \mathsf{D}(a). \tag{12.5} \]

If (12.4) holds, then for all $x \in \mathsf{D}(a)$ we have $(1+\alpha)\operatorname{Re} a(x,x) \ge \alpha\operatorname{Re} a(x,x) + \alpha\|x\|^2 =
\alpha\|x\|_a^2$ and therefore (12.5) holds with $\beta = \frac{\alpha}{1+\alpha}$.

Conversely, if (12.5) holds, then for all $x \in \mathsf{D}(a)$ we have $\operatorname{Re} a(x,x) \ge \beta\|x\|_a^2 =
\beta(\operatorname{Re} a(x,x) + \|x\|^2)$. This forces $0 < \beta < 1$ and (12.4) holds with $\alpha = \frac{\beta}{1-\beta}$.

Proposition 12.6. Let $a$ be a closed accretive form in $H$. If $V := \mathsf{D}(a)$ admits an inner
product $(\cdot|\cdot)_V$ turning $V$ into a Hilbert space such that the inclusion mapping from $V$
into $H$ is bounded, then the associated norm $\|\cdot\|_V$ is equivalent to the norm $\|\cdot\|_a$.


Proof Define a norm $|||\cdot|||$ on $V$ by

\[ |||v||| := \|v\|_V + \|v\|_a, \quad v \in V. \]

We claim that $V$ is complete with respect to $|||\cdot|||$. Indeed, if $(v_n)_{n\ge1}$ is a Cauchy sequence
with respect to $|||\cdot|||$, then it is Cauchy with respect to both $\|\cdot\|_V$ and $\|\cdot\|_a$. By
completeness there exist $v',v'' \in V$ such that $\lim_{n\to\infty}\|v_n - v'\|_V = \lim_{n\to\infty}\|v_n - v''\|_a = 0$.
Since the inclusion mapping from $V$ into $H$ is bounded with respect to both norms,
we also have $\lim_{n\to\infty}\|v_n - v'\| = \lim_{n\to\infty}\|v_n - v''\| = 0$ in $H$. It follows that $v' = v''$
as elements in $H$, hence also as elements of $V$. Setting $v := v' = v''$, we then have
$\lim_{n\to\infty}|||v_n - v||| = 0$, proving the completeness of $V$ with respect to $|||\cdot|||$. Since $\|u\|_V \le
|||u|||$ and $\|u\|_a \le |||u|||$ for all $u \in V$, the open mapping theorem can be applied to find
that both $\|\cdot\|_V$ and $\|\cdot\|_a$ are equivalent to $|||\cdot|||$.


12.1.b Gelfand Triples 


Motivated by Proposition 12.6 we shall now consider the abstract setting where we are 
given a Hilbert space V which is continuously embedded into another Hilbert space H, 
meaning that there exists a bounded injective operator $i : V \to H$. We shall write

$(\cdot|\cdot)$ and $\|\cdot\|$, respectively $(\cdot|\cdot)_V$ and $\|\cdot\|_V$,

for the inner products and norms of $H$ and $V$. Identifying elements of $V$ with their
images in $H$, without loss of generality we may (and will) assume that, as a set, $V$ is a
subspace of $H$ and $i$ is the inclusion mapping. We write

\[ V \hookrightarrow H \]

to summarise this state of affairs.


Definition 12.7 (Gelfand triples). A Gelfand triple is a triple $(i,V,H)$, where $H$ and $V$
are Hilbert spaces and $i : V \hookrightarrow H$ is a continuous and dense embedding.


Example 12.8 (Gelfand triples from closed forms). If a is a densely defined closed 
accretive form in H, then (i, D(a), H), with i the inclusion mapping from D(a) into H, 
is a Gelfand triple.

The concrete examples covered by Example 12.2 will be discussed in Section 12.3,
where the connection with weak solutions to boundary value problems will be made. 
This connection will be made more explicit in operator theoretic terms in Section 12.4. 

Our main aim is to connect Gelfand triples with the theory of closed operators. We 
will prove that if (i,V,H) is a Gelfand triple and a is a bounded accretive form on V, 
then it is possible to associate a densely defined closed linear operator A with a such 
that $D(A) \subseteq V$ and

\[ (Au|v) = a(u,v), \qquad u \in D(A),\ v \in V. \]


Moreover, suitable bounds on the resolvent of A can be given. 
We start with some preparations. 


Definition 12.9 (Conjugate dual). The conjugate dual of a Hilbert space $V$ is the vector
space $V'$ of all mappings $\varphi : V \to \mathbb{K}$ that are conjugate-linear in the sense that

\[ \varphi(u+v) = \varphi(u) + \varphi(v), \quad \varphi(cv) = \overline{c}\,\varphi(v), \qquad u,v \in V,\ c \in \mathbb{K}, \]

and bounded in the sense that

\[ |\varphi(v)| \le C\|v\|_V, \qquad v \in V, \]

where $C \ge 0$ is a constant.


It is routine to check that the space $V'$ is a Banach space in a natural way with norm

\[ \|\varphi\|_{V'} := \sup_{\|v\|_V \le 1} |\varphi(v)|. \]


In the presence of a continuous embedding $i : V \hookrightarrow H$, every element $h \in H$ defines an
element $\varphi_h \in V'$ in a natural way by defining

\[ \varphi_h(v) := (h|i(v)), \qquad v \in V, \]

and we have

\[ \|\varphi_h\|_{V'} \le \sup_{\|v\|_V \le 1} |(h|i(v))| \le \|i\|\,\|h\|. \tag{12.6} \]

As a mapping from $H$ to $V'$, the mapping $\varphi : h \mapsto \varphi_h$ is linear. Additivity is clear, and for
the scalar multiplication we have

\[ \varphi_{ch}(v) = (ch|i(v)) = c(h|i(v)) = c\varphi_h(v), \]

so $\varphi_{ch} = c\varphi_h$. The estimate (12.6) shows that this mapping is bounded with norm
$\|\varphi\| \le \|i\|$. We claim that if the inclusion mapping $i$ has dense range, then $\varphi$ is injective. Indeed,
if $\varphi_h = 0$, then for all $v \in V$ we have $(h|i(v)) = \varphi_h(v) = 0$, and since $i$ has dense range
this is only possible if $h = 0$.
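The estimate (12.6) can be made concrete in finite dimensions. The sketch below assumes $H = \mathbb{C}^3$ with the Euclidean norm and $V = \mathbb{C}^3$ with the weighted norm $\|v\|_V^2 = \sum_k w_k |v_k|^2$; the weights `w` and the vector `h` are illustrative choices, not taken from the text. In this setting the weighted Cauchy–Schwarz inequality gives $\|\varphi_h\|_{V'} = (\sum_k |h_k|^2 / w_k)^{1/2}$ and $\|i\| = (\min_k w_k)^{-1/2}$.

```python
import math

w = [1.0, 4.0, 9.0]   # assumed positive weights defining the norm of V

def inner_H(h, v):
    """Inner product of H = C^n, linear in the first argument."""
    return sum(hk * vk.conjugate() for hk, vk in zip(h, v))

def norm_H(h):
    return math.sqrt(sum(abs(hk) ** 2 for hk in h))

def norm_V(v):
    return math.sqrt(sum(wk * abs(vk) ** 2 for wk, vk in zip(w, v)))

def dual_norm(h):
    """Norm of phi_h in V', computed via the weighted Cauchy-Schwarz inequality."""
    return math.sqrt(sum(abs(hk) ** 2 / wk for hk, wk in zip(h, w)))

norm_i = 1.0 / math.sqrt(min(w))   # operator norm of the inclusion V -> H

h = [1 + 1j, 2.0, -3j]

# The supremum in the definition of the dual norm is attained at v_k = h_k / w_k,
# rescaled to have V-norm one.
v_star = [hk / wk for hk, wk in zip(h, w)]
scale = norm_V(v_star)
v_star = [vk / scale for vk in v_star]
attained = abs(inner_H(h, v_star))
```

Here `attained` coincides with `dual_norm(h)`, and both are bounded by `norm_i * norm_H(h)`, in agreement with (12.6).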




Composing $i$ and $\varphi$, every $v \in V$ defines an element $j(v) := (\varphi \circ i)v$ in $V'$, and we have

\[ j(v)(u) = \varphi_{i(v)}(u) = (i(v)|i(u)), \qquad u,v \in V. \]

The mapping $j : V \to V'$ thus obtained is linear.


Proposition 12.10. If $i : V \to H$ has dense range, then the mapping $\varphi : H \to V'$ is
injective and has dense range.


Proof  Injectivity has already been observed, so it remains to prove the dense range
property. The Riesz representation theorem sets up a norm-preserving conjugate-linear
bijection $\rho : V \to V^*$, and a norm-preserving conjugate-linear bijection $\sigma : V^* \to V'$ is
obtained by mapping a functional $v^* \in V^*$ to the conjugate-linear mapping $v' \in V'$ given
by $v'(v) := \overline{\langle v, v^*\rangle}$. Combining these identifications, we obtain a norm-preserving linear
bijection $\sigma \circ \rho : V \to V'$. By Proposition 4.31 the injectivity of $i$ implies that its adjoint
$i^*$ has dense range in $V$, and $\sigma \circ \rho$ maps this range to a dense subspace of $V'$. The claim
follows from this by observing that $\varphi = \sigma \circ \rho \circ i^*$, since for all $h \in H$ and $v \in V$ we have

\[ ((\sigma \circ \rho \circ i^*)h)(v) = \overline{\langle v, (\rho \circ i^*)h\rangle} = \overline{(v|i^*h)_V} = (i^*h|v)_V = (h|i(v)) = \varphi_h(v). \]


From now on we assume that $V$ is densely embedded in $H$, omit the mappings $i$, $j$, and $\varphi$,
and think of $V$ as a dense subspace of $H$ and $H$ as a dense subspace of $V'$.


Definition 12.11 (The linear operator associated with a form). The operator $A$ associated
with a densely defined closed accretive form $a$ in $H$ is defined by

\[ u \in D(A) \text{ and } Au = h \iff u \in D(a) \text{ and } (h|v) = a(u,v) \text{ for all } v \in D(a). \]

Since $D(a)$ is dense in $H$, the element $h \in H$ is uniquely defined and thus $A$ is well
defined as a linear operator in $H$, linearity being clear from the definition.

Without imposing further properties on a this definition is not very useful. Under 
appropriate additional assumptions on a, the next theorem provides some interesting 
properties of the associated operator. 


Theorem 12.12 (Resolvent estimate: bounded coercive forms in $V$). Let $(i,V,H)$ be a
Gelfand triple and let $A$ be the linear operator in $H$ associated with a bounded coercive
form $a$ on $V$. Then $A$ is densely defined and closed, and for all $\lambda \in \mathbb{C}$ with
$\operatorname{Re}\lambda > 0$ we have $-\lambda \in \rho(A)$ and

\[ \|(\lambda + A)^{-1}\| \le \frac{1}{\operatorname{Re}\lambda}, \qquad \|(\lambda + A)^{-1}\| \le \Bigl(1 + \frac{C}{\alpha}\Bigr)\frac{1}{|\lambda|}, \]

where $C$ and $\alpha$ are the boundedness and coercivity constants of $a$.

Proof  Fix $\lambda \in \mathbb{C}$ with $\operatorname{Re}\lambda > 0$. As a form on $V$,

\[ a_\lambda(u,v) := a(u,v) + \lambda(u|v), \qquad u,v \in V, \]

is bounded and coercive: this follows from

\[ |a_\lambda(u,v)| \le |a(u,v)| + |\lambda|\,\|u\|\,\|v\| \le C\|u\|_V\|v\|_V + |\lambda|\,\|i\|^2\|u\|_V\|v\|_V \]

and

\[ \operatorname{Re} a_\lambda(v,v) = \operatorname{Re} a(v,v) + \operatorname{Re}\lambda\,(v|v) \ge \operatorname{Re} a(v,v) \ge \alpha\|v\|_V^2. \tag{12.7} \]

Denote by $A_V$ the bounded operator on $V$ associated with $a$ through Proposition 9.15,
so that $(A_V u|v)_V = a(u,v)$ for all $u,v \in V$. The bounded operator on $V$ associated with
$a_\lambda$ equals $A_{V,\lambda} := A_V + \lambda i^* i$. By the Lax–Milgram theorem applied to the form $a_\lambda$, $A_{V,\lambda}$
is boundedly invertible with $\|A_{V,\lambda}^{-1}\|_{\mathscr{L}(V)} \le \alpha^{-1}$. Composing $A_{V,\lambda}$ with the isometric
isomorphism $\sigma \circ \rho$ from $V$ onto $V'$ from the proof of Proposition 12.10, we may identify
$A_{V,\lambda}$ with a bounded operator $A'_{V,\lambda}$ from $V$ to $V'$ which is boundedly invertible and
satisfies $\|(A'_{V,\lambda})^{-1}\|_{\mathscr{L}(V',V)} \le \alpha^{-1}$.

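Let $R_\lambda$ denote the restriction of $(A'_{V,\lambda})^{-1}$ to $H$, viewed as a bounded operator from $H$ to
$H$. As such it is bounded and injective. Define the closed operator $(B_\lambda, D(B_\lambda))$ in $H$ by
$D(B_\lambda) := R(R_\lambda)$ and $B_\lambda := R_\lambda^{-1} - \lambda$. To see that $B_\lambda$ is densely defined in $H$, note that

\[ D(B_\lambda) = R(R_\lambda) = R\bigl((A'_{V,\lambda})^{-1}|_H\bigr) = \{(A'_{V,\lambda})^{-1}h : h \in H\} \]

is dense in $V$ (and hence in $H$) since $(A'_{V,\lambda})^{-1} : V' \to V$ is an isomorphism, $V$ is dense in $H$,
and $H$ is dense in $V'$. For all $u, f \in H$,

\begin{align*}
u \in D(B_\lambda) \text{ and } B_\lambda u = f
 &\iff u \in R(R_\lambda) \text{ and } R_\lambda^{-1}u = \lambda u + f \\
 &\iff u \in V \text{ and } A'_{V,\lambda}u = \lambda u + f \\
 &\iff u \in V \text{ and } (f|v) = a(u,v) \text{ for all } v \in V \\
 &\iff u \in D(A) \text{ and } Au = f.
\end{align*}

It follows that $A = B_\lambda$, so $A$ is densely defined and closed, and $\lambda + A = \lambda + B_\lambda = R_\lambda^{-1}$.
This, in turn, implies that $\lambda + A$ is injective and surjective (the latter since $R_\lambda$ is defined
on all of $H$) and hence boundedly invertible.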

For $v \in D(A)$, the accretivity of $a$ gives

\[ \|(\lambda + A)v\|\,\|v\| \ge |((\lambda + A)v|v)| \ge \operatorname{Re}((\lambda + A)v|v) = \operatorname{Re}\lambda\,\|v\|^2 + \operatorname{Re} a(v,v) \ge \operatorname{Re}\lambda\,\|v\|^2. \]

This gives the first resolvent estimate.

Fix an arbitrary $h \in H$. Defining $u := (\lambda + A)^{-1}h = R_\lambda h \in V$ and using that
$h = (\lambda + A)u = R_\lambda^{-1}u = A'_{V,\lambda}u = A_{V,\lambda}u$, we have

\[ a_\lambda(u,v) = a(u,v) + \lambda(u|v) = (A_{V,\lambda}u|v)_V = (h|v), \qquad v \in V. \tag{12.8} \]

Taking $v := u$ in (12.8), by (12.7) we obtain

\[ \alpha\|u\|_V^2 \le \operatorname{Re} a_\lambda(u,u) = \operatorname{Re}(h|u) \le \|h\|\,\|u\|. \tag{12.9} \]

By (12.8) and (12.9),

\[ |\lambda|\,\|u\|^2 \le |(h|u)| + |a(u,u)| \le \|h\|\,\|u\| + C\|u\|_V^2 \le \Bigl(1 + \frac{C}{\alpha}\Bigr)\|h\|\,\|u\|, \]

where $C$ is the boundedness constant of $a$. Substituting back the definition of $u$ we obtain
the bound

\[ |\lambda|\,\|(\lambda + A)^{-1}h\| \le \Bigl(1 + \frac{C}{\alpha}\Bigr)\|h\|, \qquad h \in H. \]

This gives the second resolvent estimate.
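Both resolvent estimates can be tested in the simplest possible Gelfand triple, where $V = H = \mathbb{C}^2$ and $i$ is the identity. The following sketch assumes the form $a(u,v) = (Au|v)$ for a fixed real $2 \times 2$ matrix $A$ whose Hermitian part is $\mathrm{diag}(2,3)$, so that $\alpha = 2$, and takes $C = \|A\|$; the matrix, the sample values of $\lambda$, and the helper names are illustrative assumptions.

```python
import math

A = [[2.0, 1.0], [-1.0, 3.0]]   # assumed example; Hermitian part is diag(2, 3)
alpha = 2.0                     # coercivity constant: Re (Au|u) >= 2 ||u||^2

def op_norm(M):
    """Spectral norm of a 2x2 matrix via the largest eigenvalue of M* M."""
    (a, b), (c, d) = M
    g11 = abs(a) ** 2 + abs(c) ** 2            # entries of G = M* M
    g22 = abs(b) ** 2 + abs(d) ** 2
    g12 = a.conjugate() * b + c.conjugate() * d
    tr = g11 + g22
    det = g11 * g22 - abs(g12) ** 2
    disc = max(tr * tr - 4.0 * det, 0.0)
    return math.sqrt((tr + math.sqrt(disc)) / 2.0)

def resolvent(lam):
    """The inverse of lam + A, computed with the 2x2 adjugate formula."""
    a, b = lam + A[0][0], A[0][1]
    c, d = A[1][0], lam + A[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

C = op_norm(A)                  # boundedness constant of the form
lams = [0.1 + 0j, 1 + 5j, 0.01 - 20j, 3 + 0j]

# Each row: (lambda, resolvent norm, first bound, second bound)
rows = [
    (lam, op_norm(resolvent(lam)), 1.0 / lam.real, (1.0 + C / alpha) / abs(lam))
    for lam in lams
]
```

For values of $\lambda$ close to the imaginary axis the first bound $1/\operatorname{Re}\lambda$ becomes large, while the second bound still decays like $1/|\lambda|$; this is the extra information that coercivity provides.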


In applications, V often arises as the domain of a densely defined closed form a in H 
(cf. Example 12.8). In this setting, Theorem 12.12 implies the following result. 


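Corollary 12.13. Let $A$ be the linear operator in $H$ associated with a closed continuous
accretive form $a$ in $H$. Then $A$ is densely defined and closed, for all $\lambda \in \mathbb{C}$ with $\operatorname{Re}\lambda > 0$
we have $-\lambda \in \rho(A)$ and

\[ \|(\lambda + A)^{-1}\| \le \frac{1}{\operatorname{Re}\lambda}, \qquad \operatorname{Re}\lambda > 0, \]

and for all $\delta > 0$ we have

\[ \|(\lambda + A)^{-1}\| \le \frac{C_\delta}{|\lambda|}, \qquad \operatorname{Re}\lambda > \delta, \]

where $C_\delta$ is a constant depending only on $\delta$ and the continuity constant $C$ of $a$.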


Proof  Consider the Hilbert space $V = (D(a), \|\cdot\|_a)$ and let $i : V \hookrightarrow H$ be the inclusion
mapping. By Proposition 12.5 and its proof, for all $\delta > 0$ the form

\[ a^\delta(u,v) := a(u,v) + \delta(u|v) = a(u,v) + \delta(i^* i u|v)_V, \qquad u,v \in V, \]

is bounded and coercive as a form on $V$, with boundedness constant $C + \delta\|i^* i\|$ and
coercivity constant $\delta/(1+\delta)$. The operator associated with $a^\delta$ is $A + \delta$. By Theorem
12.12 this operator (and hence $A$ itself) is densely defined and closed, and for all $\operatorname{Re}\lambda > 0$
the operator $\lambda + \delta + A$ is boundedly invertible and satisfies the resolvent bounds

\[ \|(\lambda + \delta + A)^{-1}\| \le \frac{1}{\operatorname{Re}\lambda}, \qquad \|(\lambda + \delta + A)^{-1}\| \le \Bigl(1 + \frac{C + \delta\|i^* i\|}{\delta/(1+\delta)}\Bigr)\frac{1}{|\lambda|}. \]
Since $\delta > 0$ was arbitrary, the corollary follows from this.

Further properties of the operators $A$ in the theorem and its corollary will be obtained
in the next chapter (see Theorem 13.37). Without the continuity assumption it is still
possible to prove a version of the first resolvent estimate (see Theorem 13.36).

An elegant application of the corollary is the following duality result. Recall that if $a$
is a form in $H$, we define $a^*(x,y) := \overline{a(y,x)}$ for $x,y \in D(a)$.


Corollary 12.14 ($A^*$ is associated with $a^*$). Let $A$ be a densely defined closed operator
in $H$, and suppose that one of the following two conditions is satisfied:


(1) $A$ is the operator associated with a closed continuous accretive form $a$ in $H$;
(2) $A$ is the operator associated with a bounded coercive form $a$ on $V$, where $(i,V,H)$
is a Gelfand triple.


Then $A^*$ is the densely defined closed operator associated with the form $a^*$.


Proof  (1): Since $A$ is densely defined, $D(a)$ is dense. Since $D(a^*) = D(a)$ by definition,
it follows that $a^*$ is densely defined. From

\[ (v|v)_{a^*} = \operatorname{Re} a^*(v,v) + (v|v) = \operatorname{Re} a(v,v) + (v|v) = (v|v)_a, \qquad v \in V, \]

it follows that $a^*$ is continuous and accretive. Let $B$ denote the densely defined closed
operator associated with $a^*$. If $x \in D(B)$, then for all $y \in D(A)$ we have

\[ (y|Bx) = \overline{(Bx|y)} = \overline{a^*(x,y)} = a(y,x) = (Ay|x). \]

It follows that $x \in D(A^*)$ and $A^* x = Bx$. This shows that $B \subseteq A^*$.

Next let $x \in D(A^*)$. By Corollary 12.13 applied to $a^*$, the operator $I + B$ is invertible,
so there exists $y \in D(B)$ such that $(I + A^*)x = (I + B)y$. Since $B \subseteq A^*$, we have
$y \in D(A^*)$ and $(I + A^*)x = (I + A^*)y$. By Corollary 12.13 applied to $a$, the operator $I + A$ is
invertible and therefore so is its adjoint $I + A^*$. It follows that $x = y \in D(B)$. This shows
that $A^* \subseteq B$.


(2): This is proved in the same way, this time using Theorem 12.12. 
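In finite dimensions the content of Corollary 12.14 reduces to the familiar fact that the adjoint form corresponds to the conjugate transpose. The following is a minimal sketch, assuming $H = \mathbb{C}^2$ and $a(u,v) = (Au|v)$ for an illustrative matrix $A$ with positive definite Hermitian part; all names are hypothetical.

```python
A = [[2 + 1j, 1.0], [-1j, 3 - 2j]]   # assumed example with accretive form a(u,v) = (Au|v)

def matvec(M, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]

def inner(x, y):
    """Inner product of C^n, linear in the first argument."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

def a(u, v):
    return inner(matvec(A, u), v)

def a_star(u, v):
    """Adjoint form a*(u,v) = conj(a(v,u))."""
    return a(v, u).conjugate()

def adjoint(M):
    """Conjugate transpose of a square matrix."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]

u = [1 - 1j, 2j]
v = [0.5 + 0j, -3 + 1j]

lhs = inner(matvec(adjoint(A), u), v)   # (A* u | v)
rhs = a_star(u, v)                      # a*(u, v)
```

The identity `lhs == rhs` (up to rounding) says that the operator associated with $a^*$ is $A^*$; the corollary extends this to unbounded operators, where the domains have to be matched as well.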


12.1.c Closable Forms 


We return to the setting of forms in a Hilbert space H considered at the beginning of 
Section 12.1. 


Definition 12.15 (Closable forms). An accretive form $a$ in $H$ is called closable if there
exists a closed accretive form $\overline{a}$ in $H$ extending $a$, that is, $\overline{a}$ is closed and accretive,
$D(a) \subseteq D(\overline{a})$, and $\overline{a}(u,v) = a(u,v)$ for all $u,v \in D(a)$.


The following proposition, in which we view $D(a)$ as a (not necessarily complete)
normed space with norm $\|\cdot\|_a$, gives a useful necessary and sufficient condition for a
form $a$ in $H$ to be closable. It should be compared with Proposition 10.12.

Proposition 12.16. For a continuous accretive form $a$ in $H$ the following assertions are
equivalent:


(1) $a$ is closable;
(2) every Cauchy sequence in $D(a)$ converging to $0$ in $H$ converges to $0$ in $D(a)$.


The hard implication is (2)$\Rightarrow$(1). It is tempting to try to prove it as follows. By
continuity, $a$ extends to an accretive form $\overline{a}$ on the completion of $D(a)$ with respect to the
norm $\|\cdot\|_a$. It is not clear, however, whether the inclusion mapping of $D(a)$ into $H$
extends to an embedding of its completion into $H$. This difficulty explains why we have
to proceed more carefully.


Proof  Set $V := D(a)$ with norm $\|\cdot\|_V := \|\cdot\|_a$. We note that assertion (2) can be
equivalently stated as follows:


(2') Whenever a sequence $(v_n)_{n\ge 1}$ in $V$ satisfies $\lim_{n\to\infty} v_n = 0$ in $H$ and

\[ \lim_{m,n\to\infty} \operatorname{Re} a(v_m - v_n, v_m - v_n) = 0, \]

then $\lim_{n\to\infty} \operatorname{Re} a(v_n, v_n) = 0$.


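(1)$\Rightarrow$(2): Suppose that $a$ has a closed extension $\overline{a}$ whose domain $\overline{V}$ is complete with
respect to $\|\cdot\|_{\overline{a}}$. If $(u_n)_{n\ge 1}$ is a sequence in $V$ such that $\lim_{n\to\infty} u_n = 0$ in $H$ and
$\lim_{m,n\to\infty} \operatorname{Re} a(u_m - u_n, u_m - u_n) = 0$, then the sequence $(u_n)_{n\ge 1}$ is Cauchy with respect
to $\|\cdot\|_a$ and hence, since $\overline{a}$ extends $a$, with respect to $\|\cdot\|_{\overline{a}}$. Since $\overline{V}$ is complete with
respect to the norm $\|\cdot\|_{\overline{a}}$, the sequence is convergent in $\overline{V}$, say to $\overline{u} \in \overline{V}$. The sequence
$(u_n)_{n\ge 1}$ is Cauchy in $H$ as well, and since $\overline{V}$ embeds in $H$ we have $u_n \to \overline{u}$ in $H$. Since
we assumed that $u_n \to 0$ in $H$ it follows that $\overline{u} = 0$. Hence, $\lim_{n\to\infty} u_n = 0$ with respect
to $\|\cdot\|_{\overline{a}}$, and this in turn implies that $\lim_{n\to\infty} \operatorname{Re} a(u_n, u_n) = 0$.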


(2)=(1): The proof proceeds in three steps. 


Step 1. Define $\overline{V}$ to be the set of all $\overline{v} \in H$ for which there exists a Cauchy sequence
$(v_n)_{n\ge 1}$ in $V$ such that $\lim_{n\to\infty} v_n = \overline{v}$ in $H$. In what follows we refer to a sequence with
these properties as an approximating sequence for $\overline{v}$.

We begin by showing that the limit

\[ \overline{a}(\overline{u}, \overline{v}) := \lim_{n\to\infty} a(u_n, v_n) \]

exists whenever $(u_n)_{n\ge 1}$ and $(v_n)_{n\ge 1}$ are approximating sequences for $\overline{u}, \overline{v} \in \overline{V}$, and that
this limit is independent of the choice of approximating sequences.


To begin with the existence of the limit, we note that for all $m,n \ge 1$

\begin{align*}
|a(u_m, v_m) - a(u_n, v_n)| &= |a(u_m - u_n, v_m) + a(u_n, v_m - v_n)| \\
 &\le C\|u_m - u_n\|_a \sup_{k\ge 1}\|v_k\|_a + C\|v_m - v_n\|_a \sup_{k\ge 1}\|u_k\|_a \tag{12.10}
\end{align*}

by the continuity of $a$. Since $(u_n)_{n\ge 1}$ and $(v_n)_{n\ge 1}$ are Cauchy in $V$, they are bounded and
we conclude from (12.10) that $(a(u_n, v_n))_{n\ge 1}$ is a Cauchy sequence, hence convergent.
As to the well-definedness of the limit, suppose that $\overline{u}$ and $\overline{v}$ are approximated by the
sequences $(u_n')_{n\ge 1}$ and $(v_n')_{n\ge 1}$ with the properties as stated. Then, by a similar estimate,

\[ |a(u_n, v_n) - a(u_n', v_n')| \le C\|u_n - u_n'\|_a \sup_{k\ge 1}\|v_k\|_a + C\|v_n - v_n'\|_a \sup_{k\ge 1}\|u_k'\|_a. \]

Now

\[ \|u_n - u_n'\|_a^2 = \|u_n - u_n'\|^2 + \operatorname{Re} a(u_n - u_n', u_n - u_n'). \]

The first term on the right-hand side tends to $0$ as $n \to \infty$ since $u_n \to \overline{u}$ and $u_n' \to \overline{u}$ in $H$. The
second term tends to $0$ because of the restatement of (2) mentioned at the beginning of
the proof. In the same way we obtain $\|v_n - v_n'\|_a \to 0$ and the proof can be completed
as before.

It is clear that $V \subseteq \overline{V}$ and that the resulting mapping $\overline{a} : \overline{V} \times \overline{V} \to \mathbb{C}$ is sesquilinear, so
it defines a form, is continuous and accretive, and extends $a$.


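Step 2. We show that $V$ is dense in $\overline{V}$ with respect to the norm $\|\cdot\|_{\overline{a}}$. To this end let
$\overline{v} \in \overline{V}$ and let $(v_n)_{n\ge 1}$ be an approximating sequence. We claim that
$\lim_{n\to\infty}\|v_n - \overline{v}\|_{\overline{a}} = 0$. Since we already know that $\lim_{n\to\infty} v_n = \overline{v}$ in $H$, it suffices to
prove that $\lim_{n\to\infty} \operatorname{Re}\overline{a}(v_n - \overline{v}, v_n - \overline{v}) = 0$. This follows from

\[ \lim_{n\to\infty} \operatorname{Re}\overline{a}(v_n - \overline{v}, v_n - \overline{v}) = \lim_{n\to\infty}\lim_{m\to\infty} \operatorname{Re} a(v_n - v_m, v_n - v_m) = 0, \]

the first of these identities being a consequence of the definition of $\overline{a}$ along with the fact
that $v_n - v_m \to v_n - \overline{v}$ in $H$ as $m \to \infty$ and
$\operatorname{Re} a((v_n - v_m) - (v_n - v_l), (v_n - v_m) - (v_n - v_l)) \to 0$ as $l,m \to \infty$.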


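Step 3. To prove that $\overline{a}$ is closed, suppose first that $(v_n)_{n\ge 1}$ is a sequence in $V$ which
is Cauchy with respect to $\|\cdot\|_{\overline{a}}$. This means that $(v_n)_{n\ge 1}$ is Cauchy in $H$ and

\[ \lim_{m,n\to\infty} \operatorname{Re}\overline{a}(v_m - v_n, v_m - v_n) = 0. \]

Let $\lim_{n\to\infty} v_n =: \overline{v}$, the convergence being in $H$. Since $\overline{a}$ extends $a$ we have

\[ \lim_{m,n\to\infty} \operatorname{Re} a(v_m - v_n, v_m - v_n) = 0. \]

The very definition of $\overline{V}$ implies that $\overline{v} \in \overline{V}$, and as in Step 2 we have

\begin{align*}
\lim_{n\to\infty}\|v_n - \overline{v}\|_{\overline{a}}
 &= \lim_{n\to\infty}\bigl(\operatorname{Re}\overline{a}(v_n - \overline{v}, v_n - \overline{v}) + \|v_n - \overline{v}\|^2\bigr)^{1/2} \\
 &= \lim_{n\to\infty}\lim_{m\to\infty}\bigl(\operatorname{Re} a(v_n - v_m, v_n - v_m) + \|v_n - \overline{v}\|^2\bigr)^{1/2} = 0.
\end{align*}

Suppose next that $(\overline{v}_n)_{n\ge 1}$ is a sequence in $\overline{V}$ which is Cauchy with respect to $\|\cdot\|_{\overline{a}}$.
Since $V$ is dense in $\overline{V}$ by Step 2, we may choose elements $v_n \in V$ such that
$\|v_n - \overline{v}_n\|_{\overline{a}} < 1/n$. Then $(v_n)_{n\ge 1}$ is Cauchy in $V$, and by what we just proved it has a limit
$\overline{v}$ in $\overline{V}$. Then $\overline{v}$ is also a limit for $(\overline{v}_n)_{n\ge 1}$.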


The form $\overline{a}$ constructed in the above proof is called the closure of $a$. Further properties
of $\overline{a}$ are discussed in Problem 12.5.
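To see what criterion (2') of Proposition 12.16 excludes, it may help to look at a standard example of a form that is not closable; this example is not taken from the text and is included only as an illustration. On $D(a) = C[0,1] \subseteq L^2(0,1)$ put $a(u,v) := u(0)\overline{v(0)}$ and consider $u_n(x) := \max\{0, 1 - nx\}$. Then $\|u_n\|^2 = 1/(3n) \to 0$ and $a(u_m - u_n, u_m - u_n) = 0$ for all $m,n$, but $a(u_n,u_n) = 1$ for all $n$. The sketch below checks these three facts numerically with a midpoint rule; all function names are hypothetical.

```python
def ramp(n):
    """u_n(x) = max(0, 1 - n x): a continuous spike at x = 0 of height 1."""
    return lambda x: max(0.0, 1.0 - n * x)

def l2_norm_sq(f, steps=20000):
    """Midpoint-rule approximation of the squared L^2(0,1) norm of a real function."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) ** 2 for k in range(steps))

def point_form(f, g):
    """a(f, g) = f(0) g(0) for real-valued f and g (no conjugate needed)."""
    return f(0.0) * g(0.0)

ns = [1, 10, 100]
l2_sq = [l2_norm_sq(ramp(n)) for n in ns]                # approximately 1/(3n)
form_on_un = [point_form(ramp(n), ramp(n)) for n in ns]  # constantly 1

def diff(x):
    """The difference u_10 - u_100, which vanishes at x = 0."""
    return ramp(10)(x) - ramp(100)(x)

form_on_diff = point_form(diff, diff)
```

The sequence is Cauchy for $\|\cdot\|_a$ and tends to $0$ in $H$, yet $\operatorname{Re} a(u_n,u_n)$ does not tend to $0$, so (2') fails and the form has no closed extension.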


12.2 The Friedrichs Extension Theorem 


It has been shown in Corollary 10.43 that if $A$ is a densely defined operator which is
positive in the sense that $(Ax|x) \ge 0$ for all $x \in D(A)$ and has the property that $I + A$ has
dense range, then $A$ is selfadjoint. The next theorem states that if we give up the dense
range condition, selfadjoint extensions still exist.


Theorem 12.17 (Friedrichs extension). Let $A$ be a densely defined positive operator
acting in a complex Hilbert space $H$. Then:


(1) the form $a$ in $H$ given by $D(a) := D(A)$ and

\[ a(x,y) := (Ax|y), \qquad x,y \in D(A), \]

is densely defined, positive, continuous, and closable;
(2) the operator associated with the closure of $a$ is a positive selfadjoint extension of $A$.


Proof  (1): It is clear that $a$ is densely defined and positive, and continuity of $a$ follows
from the Cauchy–Schwarz inequality:

\[ |a(x,y)|^2 \le a(x,x)\,a(y,y) \le \|x\|_a^2\,\|y\|_a^2. \]

Here, the positivity of $A$ was used to see that $a(x,x) = (Ax|x) \ge 0$ and hence
$a(x,x) = \operatorname{Re} a(x,x) \le \|x\|_a^2$. To prove that $a$ is closable we check the criterion of Proposition
12.16. Keeping in mind that $a(x,x) \ge 0$ for all $x \in D(A)$, pick a sequence $(v_n)_{n\ge 1}$ in $V$
such that $\lim_{n\to\infty} v_n = 0$ in $H$ and $\lim_{m,n\to\infty} a(v_m - v_n, v_m - v_n) = 0$. We must show that
$\lim_{n\to\infty} a(v_n, v_n) = 0$.

Given $\varepsilon > 0$, for large enough $m,n$ we have

\begin{align*}
0 \le a(v_m - v_n, v_m - v_n) &= (Av_m - Av_n|v_m - v_n) \\
 &= (Av_m|v_m) + (Av_n|v_n) - 2\operatorname{Re}(Av_m|v_n) < \varepsilon.
\end{align*}

Fixing $m$, upon letting $n \to \infty$ and using that $v_n \to 0$, we obtain

\[ 0 \le (Av_m|v_m) + \limsup_{n\to\infty}(Av_n|v_n) \le \varepsilon. \]

Since $A$ is positive, this can only happen if $\limsup_{n\to\infty}(Av_n|v_n) \le \varepsilon$, and since $\varepsilon > 0$
was arbitrary this forces $\lim_{n\to\infty} a(v_n, v_n) = 0$.

(2): By (1) the form $a$ is densely defined, continuous, closable, and satisfies $a(v,v) \ge 0$
for all $v \in D(a)$. Its closure $\overline{a}$ enjoys the same properties, and therefore Corollary 12.13
allows us to associate a positive operator $B$ with $\sigma(B) \subseteq \{\operatorname{Re}\lambda \ge 0\}$. By Proposition
10.41 (which applies since positive operators are symmetric; here we use the assumption
that the scalar field is complex, cf. the remark after Definition 10.33), this implies that $B$
is selfadjoint. Alternatively one may observe that the positivity of $A$ implies that $a^* = a$
and hence $\overline{a}^* = \overline{a}$, and therefore $B = B^*$ by Corollary 12.14.


If $A$ is a densely defined closed operator from $H$ to another Hilbert space $K$, then by
Theorem 10.44 the operator $A^*A$ with domain $D(A^*A) = \{x \in D(A) : Ax \in D(A^*)\}$ is
positive and selfadjoint. The next result relates this operator with the theory of forms.


Proposition 12.18. Let $A$ be a densely defined closed operator from $H$ to another
Hilbert space $K$. The form $a$ on $D(a) := D(A)$ defined by

\[ a(u,v) := (Au|Av), \qquad u,v \in D(A), \]

is closed, continuous, and accretive, and $A^*A$ coincides with the operator associated
with $a$.


Proof  Densely definedness, continuity, and accretivity are clear. For $v \in D(a)$ we have

\[ \|v\|_a^2 = \|v\|^2 + a(v,v) = \|v\|^2 + \|Av\|^2, \]

from which we deduce that $\|\cdot\|_a$ is equivalent to the graph norm of $A$. Since $A$ is closed,
$D(a) = D(A)$ is complete with respect to $\|\cdot\|_a$ and the closedness of $a$ follows.

Let $B$ be the operator associated with $a$. Then $B$ satisfies $(Bu|u) = a(u,u) = \|Au\|^2 \ge 0$
for all $u \in D(B)$, so $B$ is positive. By the definition of the domain of an operator
associated with a form (Definition 12.11) we have

\begin{align*}
u \in D(B) &\iff u \in D(a) \text{ and } \exists f \in H : (f|v) = a(u,v) \text{ for all } v \in D(a) \\
 &\iff u \in D(A) \text{ and } \exists f \in H : (f|v) = (Au|Av) \text{ for all } v \in D(A) \\
 &\iff u \in D(A),\ Au \in D(A^*), \text{ and } Bu = A^*(Au).
\end{align*}

This shows that $B = A^*A$.
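For matrices, Proposition 12.18 is the identity $(A^{\mathsf T}Au|v) = (Au|Av)$. A minimal sketch, assuming a real $3 \times 2$ matrix as a stand-in for an operator from $H = \mathbb{R}^2$ to $K = \mathbb{R}^3$ (the matrix and the vectors are illustrative):

```python
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]   # assumed 3x2 example: R^2 -> R^3

def matvec(M, x):
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def form(u, v):
    """a(u, v) = (Au | Av), with the inner product taken in K = R^3."""
    return dot(matvec(A, u), matvec(A, v))

def associated_operator(u):
    """A* A u, where A* is the transpose in the real case."""
    return matvec(transpose(A), matvec(A, u))

u, v = [1.0, -2.0], [0.5, 4.0]
lhs = dot(associated_operator(u), v)   # (A*A u | v) in H = R^2
rhs = form(u, v)                       # a(u, v)
```

In the unbounded setting the same identity holds, but only for those $u \in D(A)$ with $Au \in D(A^*)$, which is precisely the domain condition appearing in the proof.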


12.3 The Dirichlet and Neumann Laplacians 


We now turn to some examples that connect the theory developed in the preceding
sections to the boundary value problems studied in the previous chapter.


12.3.a The Laplace Operator


Let $V := H^1(\mathbb{R}^d) = W^{1,2}(\mathbb{R}^d)$ and consider the sesquilinear form $a$ on $V$ defined by

\[ a(u,v) := \int_{\mathbb{R}^d} \nabla u \cdot \overline{\nabla v}\,dx, \qquad u,v \in V. \]

This form is bounded and positive on $V$; the easy proof is left to the reader. We claim
that the densely defined closed operator $A$ in $L^2(\mathbb{R}^d)$ associated with $a$ equals $-\Delta$, where
$\Delta$ is the weak Laplacian in $L^2(\mathbb{R}^d)$ with domain $D(\Delta) = H^2(\mathbb{R}^d)$ (cf. Theorem 11.29).

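To prove the claim we begin by noting that if $u \in H^2(\mathbb{R}^d)$, then
$\partial_j u \in H^1(\mathbb{R}^d) = W^{1,2}(\mathbb{R}^d)$ by Theorems 11.29 and 11.31, and therefore

\[ a(u,v) = -\sum_{j=1}^d \int_{\mathbb{R}^d} \partial_j^2 u(x)\,\overline{v(x)}\,dx = -(\Delta u|v) \tag{12.11} \]

for all $v \in C_c^\infty(\mathbb{R}^d)$. By approximation this identity extends to all $v \in H^1(\mathbb{R}^d)$. This
means that $u \in D(A)$ and $Au = -\Delta u$.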
Conversely, if $u \in D(A)$, then $u \in H^1(\mathbb{R}^d)$ and for all $v \in C_c^\infty(\mathbb{R}^d)$ we have

\[ \int_{\mathbb{R}^d} Au(x)\,\overline{v(x)}\,dx = (Au|v) = a(u,v) = \int_{\mathbb{R}^d} \nabla u \cdot \overline{\nabla v}\,dx = -\int_{\mathbb{R}^d} u(x)\,\overline{\Delta v(x)}\,dx \]

by the definition of weak derivatives. This shows that $u$ admits a weak Laplacian given
by $\Delta u = -Au$ in the sense of Theorem 11.29, and therefore $u \in H^2(\mathbb{R}^d)$ by this theorem.

Another description of the operator $A$ can be given on the basis of Theorem 10.44
and Proposition 12.18. These results identify the operator associated with the form $a$
to be $\nabla^*\nabla$ with domain $D(\nabla^*\nabla) = \{f \in D(\nabla) : \nabla f \in D(\nabla^*)\}$,
where $D(\nabla) = H^1(\mathbb{R}^d)$.

Summarising this discussion, we have proved: 


Theorem 12.19. The following operators in $L^2(\mathbb{R}^d)$ are equal, with equal domains:

(1) the weak Laplacian $\Delta$ with domain

\[ D(\Delta) = \{f \in L^2(\mathbb{R}^d) : f \text{ admits a weak Laplacian in } L^2(\mathbb{R}^d)\}; \]

(2) the operator $-A$, where $A$ is the operator in $L^2(\mathbb{R}^d)$ associated with the form $a$ on
$H^1(\mathbb{R}^d)$ given by

\[ a(u,v) = \int_{\mathbb{R}^d} \nabla u \cdot \overline{\nabla v}\,dx; \]

(3) the operator $-\nabla^*\nabla$ with domain

\[ D(\nabla^*\nabla) = \{f \in D(\nabla) : \nabla f \in D(\nabla^*)\}, \]

where $\nabla$ is the weak gradient, viewed as a densely defined closed operator from
$L^2(\mathbb{R}^d)$ to $L^2(\mathbb{R}^d; \mathbb{C}^d)$ with domain $D(\nabla) = H^1(\mathbb{R}^d)$.

A fourth description of $\Delta$ will be added to this list in Section 13.6.c, namely, as the
generator of the heat semigroup on $L^2(\mathbb{R}^d)$.


12.3.b The Dirichlet Laplace Operator 


Let $D$ be a nonempty bounded open subset of $\mathbb{R}^d$. As before we write

\[ H_0^1(D) := W_0^{1,2}(D). \]

Let $V := H_0^1(D)$, viewed as a dense subspace of $L^2(D)$, and consider the form $a_{\mathrm{Dir}}$ on $V$
given by

\[ a_{\mathrm{Dir}}(u,v) := \int_D \nabla u \cdot \overline{\nabla v}\,dx, \qquad u,v \in V. \tag{12.12} \]

This form is bounded, positive, and coercive (by the Poincaré inequality) as a form on
$V$. The densely defined closed operator in $L^2(D)$ associated with it is denoted by $-\Delta_{\mathrm{Dir}}$.
The operator $\Delta_{\mathrm{Dir}}$ is called the Dirichlet Laplacian on $L^2(D)$.

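To substantiate the claim that $\Delta_{\mathrm{Dir}}$ correctly models the Dirichlet boundary condition,
consider a function $u \in C^2(\overline{D})$ which satisfies $u|_{\partial D} = 0$. If $v \in C_c^\infty(D)$, an integration by
parts gives

\[ (\Delta u|v) = \int_D (\Delta u)\,\overline{v}\,dx = -\int_D \nabla u \cdot \overline{\nabla v}\,dx = -a_{\mathrm{Dir}}(u,v), \]

where the last identity is justified by the fact that $u$ belongs to $H_0^1(D)$ by Theorem 11.24.
Since $C_c^\infty(D)$ is dense in $H_0^1(D)$ it follows that $u \in D(\Delta_{\mathrm{Dir}})$ and $\Delta_{\mathrm{Dir}}u = \Delta u$.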
Using Theorem 11.37 it follows that

\[ D(\Delta_{\mathrm{Dir}}) = \{u \in H_0^1(D) \cap H^2_{\mathrm{loc}}(D) : \Delta u \in L^2(D)\}. \tag{12.13} \]

To prove this we must show that a function $u \in H_0^1(D)$ belongs to $H^2_{\mathrm{loc}}(D)$ and satisfies
$\Delta u \in L^2(D)$ if and only if there exists $f \in L^2(D)$ such that

\[ \int_D f\,\overline{v}\,dx = \int_D \nabla u \cdot \overline{\nabla v}\,dx, \qquad v \in H_0^1(D). \tag{12.14} \]

If such a function $f$ exists, then $u$ is the weak solution of the Poisson problem $-\Delta u = f$
and Theorem 11.37 implies that $u \in H^2_{\mathrm{loc}}(D)$. In the converse direction, suppose that
$u \in H_0^1(D)$ belongs to $H^2_{\mathrm{loc}}(D)$ and satisfies $\Delta u \in L^2(D)$. If $\phi \in C_c^\infty(D)$ is a given test
function, select an open set $U \Subset D$ containing the support of $\phi$ and use the fact that
$u \in H^2(U)$ to see that

\[ \int_D \nabla u \cdot \overline{\nabla\phi}\,dx = \int_U \nabla u \cdot \overline{\nabla\phi}\,dx = -\int_U (\Delta u)\,\overline{\phi}\,dx = -\int_D (\Delta u)\,\overline{\phi}\,dx. \]

Since $\Delta u \in L^2(D)$, both sides depend continuously on $\phi$ with respect to the norm of
$H_0^1(D)$. Since $C_c^\infty(D)$ is dense in $H_0^1(D)$, it follows that this identity extends to
arbitrary $\phi \in H_0^1(D)$. This proves that $u$ satisfies (12.14) with $f := -\Delta u \in L^2(D)$.

The result of Remark 11.38 also implies, by the same reasoning, that if $D$ has a $C^2$-boundary,
this domain characterisation improves to

\[ D(\Delta_{\mathrm{Dir}}) = H_0^1(D) \cap H^2(D). \]


12.3.c The Neumann Laplace Operator 


As before we let $D$ be a nonempty bounded open subset of $\mathbb{R}^d$. As a variation of the
preceding example, we may take $V := H^1(D) = W^{1,2}(D)$, viewed as a dense subspace
of $H := L^2(D)$, and consider the form $a_{\mathrm{Neum}}$ on $V$ given by

\[ a_{\mathrm{Neum}}(u,v) := \int_D \nabla u \cdot \overline{\nabla v}\,dx, \qquad u,v \in V. \]

The only difference with (12.12) is the different choice of the space $V$. This form is
bounded and positive as a form on $V$. The densely defined closed operator in $L^2(D)$
associated with it is denoted by $-\Delta_{\mathrm{Neum}}$. The operator $\Delta_{\mathrm{Neum}}$ is called the Neumann
Laplacian on $L^2(D)$.

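To substantiate the claim that $\Delta_{\mathrm{Neum}}$ correctly models the Neumann boundary condition,
let us assume for the moment that $D$ has a $C^1$-boundary. Consider a function
$u \in C^2(\overline{D})$ which satisfies $\frac{\partial u}{\partial\nu}\big|_{\partial D} = 0$, where $\nu$ is the outward normal vector on $\partial D$. If
$v \in C^1(\overline{D})$, then Green's formula asserts that

\begin{align*}
(\Delta u|v) = \int_D (\Delta u)\,\overline{v}\,dx
 &= \int_{\partial D} \frac{\partial u}{\partial\nu}\,\overline{v}\,dS - \int_D \nabla u \cdot \overline{\nabla v}\,dx \\
 &= -\int_D \nabla u \cdot \overline{\nabla v}\,dx = -a_{\mathrm{Neum}}(u,v),
\end{align*}

where $S$ is the normalised surface measure on $\partial D$. Since $C^1(\overline{D})$ is dense in $H^1(D)$ by
Theorem 11.27, it follows that $u \in D(\Delta_{\mathrm{Neum}})$ and $\Delta_{\mathrm{Neum}}u = \Delta u$.
As for the Dirichlet Laplacian, Theorem 11.44 implies that

\[ D(\Delta_{\mathrm{Neum}}) = \{u \in H^1(D) \cap H^2_{\mathrm{loc}}(D) : \Delta u \in L^2(D)\}. \tag{12.15} \]

If $D$ has a $C^2$-boundary, this improves to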


D(Aneum) = H(D). 


12.3.d Selfadjointness 


The following result is an immediate consequence of Theorem 12.17 (noting that the 
forms involved are closed): 


Theorem 12.20 (Selfadjointness of the Laplacian). The Laplacian on $L^2(\mathbb{R}^d)$ and the
Dirichlet and Neumann Laplacians on $L^2(D)$ with $D \subseteq \mathbb{R}^d$ nonempty, bounded, and
open, are positive and selfadjoint.

These three operators also fall into the setting of Theorem 10.44. Indeed, by Proposition
12.18, all three Laplacians are of the form $-\nabla^*\nabla$, where $\nabla$ is the gradient viewed as
a densely defined closed operator from $H$ to $K$, where $H = L^2(U)$ and $K = L^2(U;\mathbb{K}^d)$
with $U \in \{\mathbb{R}^d, D\}$. This gives an alternative proof of their selfadjointness.


12.3.e Operators in Divergence Form 


Let $D$ be a nonempty bounded open subset of $\mathbb{R}^d$ and consider a matrix-valued function
$a : D \to M_d(\mathbb{K})$ satisfying the following conditions:


(i) the coefficients $a_{ij} : D \to \mathbb{K}$ are measurable and bounded;

(ii) for all $x \in D$ and $\xi \in \mathbb{K}^d$ we have $\operatorname{Re}\sum_{i,j=1}^d a_{ij}(x)\,\xi_i\overline{\xi_j} \ge 0$.


Condition (ii) is an accretivity condition and is more general than its coercive counterpart
used in our treatment of the Sturm–Liouville problem in the preceding chapter.

Under the assumptions (i) and (ii), a bounded accretive form $\mathfrak{a}_a$ on both $V := H_0^1(D)$
(in the case of Dirichlet boundary conditions) and $V := H^1(D)$ (in the case of Neumann
boundary conditions) can be defined by

\[ \mathfrak{a}_a(u,v) := \int_D a\nabla u \cdot \overline{\nabla v}\,dx, \qquad u,v \in V. \]

The operator on $L^2(D)$ associated with $\mathfrak{a}_a$ is usually denoted by

\[ -\operatorname{div}(a\nabla) \]

in recognition of the fact that (at least formally) the Hilbert space adjoint of $\nabla$ equals
$-\operatorname{div}$. The operator $\operatorname{div}(a\nabla)$ is often referred to as a second order differential operator
in divergence form. This operator is selfadjoint if the coefficients satisfy the symmetry
condition $a_{ij} = \overline{a_{ji}}$.
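The relation between the form and the operator $-\operatorname{div}(a\nabla)$ has an exact discrete counterpart, namely summation by parts. The following is a minimal one-dimensional sketch, assuming Dirichlet boundary conditions on $(0,1)$, a uniform grid with five interior points, and a hypothetical real coefficient `coeff`; none of these choices come from the text.

```python
def coeff(x):
    """Hypothetical bounded, nonnegative coefficient a(x)."""
    return 1.0 + x * x

def pad(u):
    """Extend interior values by the Dirichlet boundary values u(0) = u(1) = 0."""
    return [0.0] + list(u) + [0.0]

def apply_L(u, h):
    """Discrete -(a u')' at the interior nodes, with a sampled at cell midpoints."""
    U = pad(u)
    out = []
    for k in range(1, len(u) + 1):
        a_plus = coeff((k + 0.5) * h)
        a_minus = coeff((k - 0.5) * h)
        out.append(-(a_plus * (U[k + 1] - U[k]) - a_minus * (U[k] - U[k - 1])) / h ** 2)
    return out

def form(u, v, h):
    """Discrete form: h * sum over cells of a * (difference quotient of u) * (same for v)."""
    U, V = pad(u), pad(v)
    return h * sum(
        coeff((k + 0.5) * h) * (U[k + 1] - U[k]) * (V[k + 1] - V[k])
        for k in range(len(u) + 1)
    ) / h ** 2

def inner(u, v, h):
    """Discrete L^2 inner product on the interior nodes."""
    return h * sum(a * b for a, b in zip(u, v))

n = 5
h = 1.0 / (n + 1)
u = [1.0, -2.0, 0.5, 3.0, 1.0]
v = [0.0, 1.0, 1.0, -1.0, 2.0]

lhs = inner(apply_L(u, h), v, h)   # discrete analogue of (-div(a grad u) | v)
rhs = form(u, v, h)                # discrete analogue of the form
```

Because the coefficient is real, the discrete form is symmetric, which mirrors the symmetry condition $a_{ij} = \overline{a_{ji}}$ for selfadjointness.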


12.4 The Poisson Problem Revisited 


We now revisit the Poisson problem $-\Delta u = f$ by viewing it as a special instance of the
abstract problem

\[ Au = x, \]

where $A$ is assumed to be a closed operator acting in a Banach space $X$, $x \in X$ is a
given element, and $u \in X$ is the unknown. One could define a classical solution as an
element $u \in D(A)$ such that $Au = x$, but this is not what we did in Section 11.2. Instead,
we considered weak solutions defined in terms of the sesquilinear form with which $A$ is
associated. A third option is to use duality to define a scalar solution to be an element
$u \in X$ with the property that

\[ \langle u, A^*x^*\rangle = \langle x, x^*\rangle, \qquad x^* \in D(A^*). \]

In line with standard functional analytic terminology it would be more appropriate to
call this a weak solution, but the usage of the term 'weak solution' in connection with
integration by parts using test functions is well established.

Proposition 12.21. Let $A$ be a densely defined closed linear operator on a Banach
space $X$ and let $x \in X$ be a given element. For an element $u \in X$ the following assertions
are equivalent:


(1) $u$ is a classical solution of $Au = x$, that is, $u \in D(A)$ and $Au = x$;
(2) $u$ is a scalar solution of $Au = x$, that is, $\langle u, A^*x^*\rangle = \langle x, x^*\rangle$ for all $x^* \in D(A^*)$.


If $X = H$ is a Hilbert space and $A$ is the operator associated with a form $a$ on a continuously
and densely embedded Hilbert space $V$, then (1) and (2) are equivalent to:


(3) $u$ is a weak solution of $Au = x$, that is, $u \in V$ and $a(u,v) = (x|v)$ for all $v \in V$.


Proof  The implications (1)$\Rightarrow$(2) and (1)$\Rightarrow$(3) are trivial. The implication (2)$\Rightarrow$(1) is
an immediate consequence of Proposition 10.20. Finally, if (3) holds, then by the definition
of the associated operator we have $u \in D(A)$ and $Au = x$, so (1) holds.


In the special case where $A$ is the Dirichlet or Neumann Laplacian, Proposition 12.21
implies that every weak solution of the Poisson problem $-\Delta u = f$ with $f \in L^2(D)$ is
in fact a strong solution. In view of the domain identifications (12.13) and (12.15), this
recovers the maximal regularity results of Theorems 11.37 and 11.44.

For the sake of completeness we also mention a maximal regularity result for the
Poisson problem $-\Delta u = f$ on the full space $\mathbb{R}^d$.


Theorem 12.22 (Maximal regularity for $\mathbb{R}^d$). Let $f \in L^2(\mathbb{R}^d)$. If $u$ is a weak solution of
the Poisson problem $-\Delta u = f$ on $\mathbb{R}^d$, then $u \in H^2(\mathbb{R}^d)$.


Proof  If $u$ is a weak solution, an integration by parts gives that $u$ admits a weak Laplacian.
The result now follows from Theorem 11.29.


12.5 Weyl’s Theorem 


This section is a digression from the main line of development and is dedicated to a
proof of Weyl's celebrated asymptotic formula for the number of eigenvalues of the Dirichlet
Laplacian.


12.5.a Spectrum of the Dirichlet and Neumann Laplacians


As a warm-up we compute the spectrum of $\Delta_{\mathrm{Dir}}$ and $\Delta_{\mathrm{Neum}}$ in $L^2(0,1)$.


Example 12.23. Both $-\Delta_{\mathrm{Dir}}$ and $-\Delta_{\mathrm{Neum}}$ are positive and selfadjoint as operators in
$L^2(0,1)$ (by Theorem 12.20) and their spectrum is contained in $[0,\infty)$ (by Theorem
12.17). We will use the fact that every $u \in C^2[0,1]$ is included in their domains, that
the Laplacians of such a function $u$ are given by taking classical second derivatives
pointwise, and that a function $u \in C^2[0,1]$ belongs to $H_0^1(0,1)$ if and only if
$u(0) = u(1) = 0$; we leave the elementary proof to the reader (see Problem 11.2 for a more
precise result).

The functions $u_n(\theta) = \sin(\pi n\theta)$, $n \ge 1$, satisfy $-\Delta_{\mathrm{Dir}}u_n = -u_n'' = \pi^2n^2u_n$ and obey
Dirichlet boundary conditions. Moreover, by Theorem 3.30, these functions form (after
normalisation) an orthonormal basis for $L^2(0,1)$. By Proposition 10.32, this implies

\[ \sigma(-\Delta_{\mathrm{Dir}}) = \{\pi^2n^2 : n = 1,2,\dots\}, \]

with eigenfunctions $u_n(\theta) = \sin(\pi n\theta)$. Likewise, the functions $v_n(\theta) = \cos(\pi n\theta)$, $n \ge 0$,
satisfy $-\Delta_{\mathrm{Neum}}v_n = -v_n'' = \pi^2n^2v_n$ and obey Neumann boundary conditions. Again
by Theorem 3.30, they form (after normalisation) an orthonormal basis for $L^2(0,1)$. This implies

\[ \sigma(-\Delta_{\mathrm{Neum}}) = \{\pi^2n^2 : n = 0,1,2,\dots\}. \]
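The Dirichlet eigenvalues $\pi^2 n^2$ can be compared with those of the standard three-point finite difference approximation of $-u''$ on a uniform grid. The sketch below assumes $N = 200$ interior grid points (an arbitrary illustrative choice) and uses the classical fact that the sampled sine functions are exact eigenvectors of the difference matrix, with eigenvalues $\frac{4}{h^2}\sin^2(\pi n h/2)$.

```python
import math

def apply_dirichlet_stencil(u, h):
    """Apply the 3-point stencil for -u'' with u = 0 at both endpoints."""
    U = [0.0] + list(u) + [0.0]
    return [(2.0 * U[k] - U[k - 1] - U[k + 1]) / h ** 2 for k in range(1, len(u) + 1)]

N = 200                 # number of interior grid points (illustrative choice)
h = 1.0 / (N + 1)

def mode(n):
    """Samples of u_n(theta) = sin(pi n theta) at the interior grid points."""
    return [math.sin(math.pi * n * k * h) for k in range(1, N + 1)]

def discrete_eigenvalue(n):
    """Eigenvalue of the difference matrix belonging to the sampled mode."""
    return 4.0 / h ** 2 * math.sin(math.pi * n * h / 2.0) ** 2

residuals, rel_errors = [], []
for n in (1, 2, 3):
    u = mode(n)
    Lu = apply_dirichlet_stencil(u, h)
    mu = discrete_eigenvalue(n)
    # Largest entry of L u - mu u; small values confirm the eigenvector property.
    residuals.append(max(abs(a - mu * b) for a, b in zip(Lu, u)))
    # Relative distance between the discrete eigenvalue and pi^2 n^2.
    rel_errors.append(abs(mu - (math.pi * n) ** 2) / (math.pi * n) ** 2)
```

A Taylor expansion of the sine shows that the relative error behaves like $(\pi n h)^2/12$, so the low eigenvalues are reproduced accurately while the approximation degrades as $n$ approaches $N$.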
Turning to higher dimensions, we begin with a simple observation.

Proposition 12.24. Let $D$ be a nonempty bounded open subset of $\mathbb{R}^d$.


(1) $\Delta_{\mathrm{Dir}}$ is both injective and surjective, and hence invertible;

(2) if, in addition, $D$ is connected and has $C^1$-boundary, the null space of $\Delta_{\mathrm{Neum}}$ consists
of the constant functions and its range is the orthogonal complement of the constant
functions. In particular, its range is closed and

\[ \dim N(\Delta_{\mathrm{Neum}}) = \operatorname{codim} R(\Delta_{\mathrm{Neum}}) = 1. \]


Extending the corresponding definition for bounded operators, a Fredholm operator 
is a closed operator whose null space is finite-dimensional and whose range has finite 
codimension. With the same proof as in the bounded case, the second condition implies 
that the range is closed. The index of such an operator A is defined as 


$$\operatorname{ind}(A) := \dim \mathsf{N}(A) - \operatorname{codim} \mathsf{R}(A).$$


Proposition 12.24 implies that both $\Delta_{\mathrm{Dir}}$ and $\Delta_{\mathrm{Neum}}$ (the latter under the stated more restrictive assumptions on $D$) are Fredholm operators with index 0.


Proof (1): If $\Delta_{\mathrm{Dir}}u = 0$ for some $u \in D(\Delta_{\mathrm{Dir}})$, then $u \in H_0^1(D)$ and $u$ is a classical solution, and hence a weak solution, of the Dirichlet Poisson problem with $f = 0$. By the uniqueness of weak solutions it follows that $u = 0$. Likewise, surjectivity follows from the existence of weak solutions for any $f \in L^2(D)$ combined with Proposition 12.21, according to which weak solutions are strong solutions.

(2): This is proved in the same way, using that the problem $\Delta u = f$ with Neumann boundary conditions has a weak solution for a given $f \in L^2(D)$ if and only if $\int_D f\,dx = 0$, and that uniqueness of weak solutions holds in $H^1_{\mathrm{av}}(D) = \{u \in H^1(D) : \int_D u\,dx = 0\}$.


For the proof of the next theorem we isolate a lemma that will also be useful in the 
next chapter. 


Lemma 12.25. Let $A$ be a closed operator on a Banach space $X$ with nonempty resolvent set and suppose that $R(\lambda_0,A)$ is compact for some $\lambda_0 \in \rho(A)$. Then:


(1) for all $\lambda \in \rho(A)$ the resolvent operator $R(\lambda,A)$ is compact;
(2) for all $\lambda \in \rho(A)$ the following spectral mapping theorem holds:

$$\sigma(R(\lambda,A)) \setminus \{0\} = \Big\{\frac{1}{\lambda-\mu} : \mu \in \sigma(A)\Big\};$$

(3) every $\mu \in \sigma(A)$ is an eigenvalue with finite multiplicity;
(4) for all $\mu \in \sigma(A)$, the eigenspace of the eigenvalue $\mu$ for $A$ and the eigenspace of the eigenvalue $\frac{1}{\lambda-\mu}$ for $R(\lambda,A)$ coincide;

(5) $\sigma(A)$ is either finite, or it is a sequence diverging to $\infty$.
Proof Once (1) and (2) have been established, (3)–(5) are easy consequences of the Riesz–Schauder theorem for compact operators (Theorem 7.11).

(1): This is immediate from the resolvent identity.

(2): Fix $\lambda \in \rho(A)$.

‘$\subseteq$’: Let $\nu \in \sigma(R(\lambda,A)) \setminus \{0\}$. By the Riesz–Schauder theorem, $\nu$ is an eigenvalue for $R(\lambda,A)$. If $R(\lambda,A)x = \nu x$ with $x \in D(A)$, then $x = \nu(\lambda - A)x$ and $Ax = (\lambda - \frac{1}{\nu})x = \mu x$ with $\mu := \lambda - \frac{1}{\nu}$. It follows that $\mu$ is an eigenvalue of $A$ and $\nu = \frac{1}{\lambda-\mu}$.

‘$\supseteq$’: Let $\mu \in \sigma(A)$. If we had $\frac{1}{\lambda-\mu} \in \rho(R(\lambda,A))$, then by a direct computation it would follow that $\mu \in \rho(A)$ and

$$R(\mu,A) = \frac{1}{\lambda-\mu}\,R(\lambda,A)\,R\Big(\frac{1}{\lambda-\mu},R(\lambda,A)\Big).$$

This contradiction shows that $\frac{1}{\lambda-\mu} \in \sigma(R(\lambda,A))$.

(3) and (4): For all $x \in X$ and $\mu \in \sigma(A)$ we have

$$x \in D(A) \ \text{ and } \ Ax = \mu x \iff R(\lambda,A)x = \frac{1}{\lambda-\mu}\,x.$$

This gives (4). By (1) and the Riesz–Schauder theorem, $\sigma(R(\lambda,A)) \setminus \{0\}$ consists of eigenvalues of finite multiplicity. If $\mu \in \sigma(A)$, then $\frac{1}{\lambda-\mu}$ is an eigenvalue for $R(\lambda,A)$ of finite multiplicity by (2), and then (2) and (4) show that $\mu$ is an eigenvalue for $A$ of the same finite multiplicity.

(5): If $\sigma(A)$ is an infinite set, then so is $\sigma(R(\lambda,A))$. By the Riesz–Schauder theorem, $\sigma(R(\lambda,A))$ can only accumulate at 0, so $\sigma(A)$ can only accumulate at infinity.


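Theorem 12.26. Let $D$ be a nonempty bounded open subset of $\mathbb{R}^d$. Then:
(1) the spectrum of $-\Delta_{\mathrm{Dir}}$ is of the form
$$\sigma(-\Delta_{\mathrm{Dir}}) = \{\lambda_1,\lambda_2,\dots\} \ \text{ with } \ 0 < \lambda_1 < \lambda_2 < \dots \to \infty;$$
(2) if, in addition, $D$ has $C^1$-boundary, the spectrum of $-\Delta_{\mathrm{Neum}}$ is of the form
$$\sigma(-\Delta_{\mathrm{Neum}}) = \{\lambda_1,\lambda_2,\dots\} \ \text{ with } \ 0 = \lambda_1 < \lambda_2 < \dots \to \infty.$$
In either case, each $\lambda_j$ is an eigenvalue with finite-dimensional eigenspace.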


Proof Let $A := -\Delta_{\mathrm{Dir}}$ (in the case of Dirichlet boundary conditions) or $A := -\Delta_{\mathrm{Neum}}$ (in the case of Neumann boundary conditions). We claim that, under the respective assumptions on $D$, the resolvent operators $R(\lambda,A)$ are compact for all $\lambda \in \rho(A)$. To prove the claim we recall that $D(A)$ is contained in $V := H_0^1(D)$ (in the case of Dirichlet boundary conditions), respectively in $V := H^1(D)$ (in the case of Neumann boundary conditions). By the Rellich–Kondrachov theorem (Theorem 11.41), in either case the inclusion mapping from $V$ into $L^2(D)$ is compact. The compactness of $R(\lambda,A)$ now follows by viewing it as the composition of three bounded operators, one of which is compact: (i) $R(\lambda,A)$, viewed as a bounded operator from $L^2(D)$ to $D(A)$, (ii) the inclusion mapping from $D(A)$ into $V$, which is bounded by the closed graph theorem, the closedness of $A$, and the boundedness of the inclusion mappings from both $D(A)$ and $V$ into $L^2(D)$, and (iii) the compact inclusion mapping from $V$ into $L^2(D)$.

Since $A$ is positive and selfadjoint (by Theorem 12.20) we have $\sigma(A) \subseteq [0,\infty)$ (by Proposition 10.40). By Proposition 12.24 we have $0 \in \rho(-\Delta_{\mathrm{Dir}})$ and $0 \in \sigma(-\Delta_{\mathrm{Neum}})$. The result now follows from Lemma 12.25.


As a variation on the min-max theorem for compact positive Hilbert space operators (Theorem 9.4), we prove an explicit formula for the Dirichlet and Neumann eigenvalues of the Laplace operator on a nonempty bounded open set $D \subseteq \mathbb{R}^d$; in the case of Neumann boundary conditions we make the additional assumption that $D$ is connected and has $C^1$-boundary. We denote by $0 < \lambda_1 \le \lambda_2 \le \dots$ and $0 = \mu_1 \le \mu_2 \le \dots$ the sequences of eigenvalues of $-\Delta_{\mathrm{Dir}}$ and $-\Delta_{\mathrm{Neum}}$, respectively, taking multiplicities into account.


Theorem 12.27 (Courant–Fischer). With the notation just introduced,
(1) for all $n \ge 1$ we have

$$\lambda_n = \inf_{\substack{Y \subseteq H_0^1(D)\\ \dim(Y) = n}}\ \sup_{\substack{y \in Y\\ y \ne 0}} \frac{\|\nabla y\|_{L^2(D)}^2}{\|y\|_{L^2(D)}^2},$$

where the infimum is taken over all subspaces $Y$ of dimension $n$;
(2) if, in addition, $D$ has a $C^1$-boundary, then for all $n \ge 1$ we have

$$\mu_n = \inf_{\substack{Y \subseteq H^1(D)\\ \dim(Y) = n}}\ \sup_{\substack{y \in Y\\ y \ne 0}} \frac{\|\nabla y\|_{L^2(D)}^2}{\|y\|_{L^2(D)}^2},$$

where the infimum is taken over all subspaces $Y$ of dimension $n$.


Proof We present the case of Dirichlet eigenvalues, the proof for Neumann eigenvalues being entirely similar (the zero eigenvalue $\mu_1$ does not create difficulties since it has multiplicity 1; here we use the connectedness assumption).

We write $\Delta := \Delta_{\mathrm{Dir}}$ and choose an orthonormal basis $(h_j)_{j\ge1}$ in $L^2(D)$ such that $-\Delta h_j = \lambda_j h_j$ for all $j \ge 1$. As was shown in the proof of Theorem 12.26, such a sequence exists by the spectral theorem applied to the compact positive operator $(-\Delta)^{-1}$; this theorem also implies that the span of this sequence is dense in $L^2(D)$.

For $n \ge 1$ let

$$H_{n-1} := \{f \in H_0^1(D) : (f|h_j) = 0, \ j = 1,\dots,n-1\},$$

with the convention that $H_0 = H_0^1(D)$.

Step 1 – Fix $f \in L^2(D)$ and set $f_n := \sum_{j=1}^n c_jh_j$ with $c_j := (f|h_j)$. Since $(h_j)_{j\ge1}$ is an orthonormal basis for $L^2(D)$ we have $f_n \to f$ in $L^2(D)$ as $n \to \infty$.

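Clearly, $f - f_n \perp f_n$ in $L^2(D)$. We claim that if $f \in H_0^1(D)$, then also $f - f_n \perp f_n$ in $H_0^1(D)$. In view of

$$(g|g')_{H_0^1(D)} = (g|g') + (\nabla g|\nabla g'), \qquad g,g' \in H_0^1(D), \tag{12.16}$$

this amounts to showing that

$$(\nabla(f - f_n)|\nabla f_n) = 0.$$

For all $j,k \ge 1$ we have

$$(\nabla h_j|\nabla h_k) = -(\Delta h_j|h_k) = \lambda_j(h_j|h_k) = \lambda_j\delta_{jk}$$

and therefore

$$(\nabla f_n|\nabla f_n) = \sum_{j=1}^n |c_j|^2\lambda_j.$$

Also, for $j \ge 1$ we have $\lambda_j > 0$ and

$$(\nabla f|\nabla h_j) = -(f|\Delta h_j) = \lambda_j(f|h_j) = \lambda_jc_j,$$

and therefore

$$(\nabla f|\nabla f_n) = \sum_{j=1}^n \overline{c_j}\,(\nabla f|\nabla h_j) = \sum_{j=1}^n |c_j|^2\lambda_j. \tag{12.17}$$

It follows that

$$(\nabla(f - f_n)|\nabla f_n) = \sum_{j=1}^n |c_j|^2\lambda_j - \sum_{j=1}^n |c_j|^2\lambda_j = 0.$$

This proves the claim.
By what we just proved,

$$\|\nabla f\|^2 = \|\nabla(f - f_n)\|^2 + \|\nabla f_n\|^2 \ge \|\nabla f_n\|^2.$$

This shows that the sequence $(f_n)_{n\ge1}$ is bounded in $H_0^1(D)$. By Proposition 3.16, some subsequence $(f_{n_k})_{k\ge1}$ converges weakly to a limit $\tilde f$ in $H_0^1(D)$. Since also $f_n \to f$ in $L^2(D)$ we must have $\tilde f = f$. Thus $f_{n_k} \to f$ weakly in $H_0^1(D)$. By (12.16) this implies $\nabla f_{n_k} \to \nabla f$ weakly in $L^2(D;\mathbb{K}^d)$. By (12.17) it then follows that

$$\|\nabla f\|^2 = \lim_{k\to\infty} (\nabla f|\nabla f_{n_k}) = \lim_{k\to\infty} \sum_{j=1}^{n_k} |c_j|^2\lambda_j = \sum_{j\ge1} |c_j|^2\lambda_j. \tag{12.18}$$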


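Step 2 – Let $Y \subseteq H_0^1(D)$ be any subspace of dimension $n$. Since $H_{n-1}$ has codimension $n-1$ in $H_0^1(D)$, the intersection $H_{n-1} \cap Y$ is a nonzero subspace of $H_0^1(D)$ and hence contains a nonzero element $f$. Applying the results of Step 1 to $f$ and noting that $(f|h_j) = c_j = 0$ for $j = 1,\dots,n-1$, by (12.18) we have

$$\|\nabla f\|^2 = \sum_{j\ge n} \lambda_j|c_j|^2 \ge \lambda_n \sum_{j\ge n} |c_j|^2 = \lambda_n\|f\|^2.$$

This proves the inequality

$$\lambda_n \le \inf_{\substack{Y \subseteq H_0^1(D)\\ \dim(Y) = n}}\ \sup_{\substack{y \in Y\\ y \ne 0}} \frac{\|\nabla y\|^2}{\|y\|^2}.$$

Step 3 – If $f$ belongs to the span of $\{h_1,\dots,h_n\}$, then by (12.18),

$$\|\nabla f\|^2 = \sum_{j=1}^n |c_j|^2\lambda_j \le \lambda_n \sum_{j=1}^n |c_j|^2 = \lambda_n\|f\|^2.$$

This proves the inequality

$$\lambda_n \ge \inf_{\substack{Y \subseteq H_0^1(D)\\ \dim(Y) = n}}\ \sup_{\substack{y \in Y\\ y \ne 0}} \frac{\|\nabla y\|^2}{\|y\|^2}.$$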


This completes the proof. 
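To illustrate part (1) of the theorem for $n = 1$ on $D = (0,1)$: since $\lambda_1 = \pi^2$ is the infimum of the Rayleigh quotient $\|y'\|^2/\|y\|^2$ over nonzero $y \in H_0^1(0,1)$, every trial function gives an upper bound for $\pi^2$. The following sketch evaluates the quotient exactly for the polynomial trial function $y(x) = x(1-x)$; the choice of trial function and the helper names are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (p[k] is the coefficient of x**k)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * p[k] for k in range(1, len(p))]

def integral_01(p):
    """Exact integral of a polynomial over (0, 1)."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def rayleigh_quotient(p):
    """||y'||^2 / ||y||^2 in L^2(0,1) for a real polynomial y."""
    dp = poly_diff(p)
    return integral_01(poly_mul(dp, dp)) / integral_01(poly_mul(p, p))

y = [Fraction(0), Fraction(1), Fraction(-1)]   # y(x) = x - x^2, vanishes at 0 and 1
q = rayleigh_quotient(y)
print(q, float(q) >= math.pi**2)               # prints: 10 True
```

Here $\int_0^1 (1-2x)^2\,dx = \tfrac13$ and $\int_0^1 x^2(1-x)^2\,dx = \tfrac1{30}$, so the quotient equals 10, slightly above $\pi^2 \approx 9.87$.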


Corollary 12.28. For all $n \ge 1$ we have $\mu_n \le \lambda_n$.

12.5.b Weyl’s theorem


The following celebrated theorem of Weyl gives an asymptotic expression for the number of Dirichlet eigenvalues in the interval $[0,r]$ as $r \to \infty$.


Theorem 12.29 (Weyl). Let $D$ be a nonempty bounded open subset of $\mathbb{R}^d$ satisfying $|\partial D| = 0$, let $0 < \lambda_1 \le \lambda_2 \le \dots$ be the sequence of eigenvalues of $-\Delta_{\mathrm{Dir}}$ on $L^2(D)$, taking multiplicities into account, and for $r > 0$ let

$$N_D(r) := \max\{n \ge 1 : \lambda_n \le r\}.$$

Then

$$\lim_{r\to\infty} \frac{N_D(r)}{r^{d/2}} = \frac{\omega_d}{(2\pi)^d}\,|D|,$$

where $\omega_d = \pi^{d/2}/\Gamma(1 + \tfrac12 d)$ is the volume of the unit ball in $\mathbb{R}^d$.


The condition $|\partial D| = 0$ is satisfied if the boundary is a rectifiable curve.
Before turning to the proof it is instructive to revisit Example 12.23. For the Dirichlet Laplacian in $L^2(0,1)$ we obtain

$$N_D(r) = \max\{n \ge 1 : \pi^2n^2 \le r\}.$$

On the other hand, $\omega_1 = |(-1,1)| = 2$ and $|D| = |(0,1)| = 1$. It follows that

$$\lim_{r\to\infty} \frac{N_D(r)}{r^{1/2}} = \frac{1}{\pi} = \frac{2}{2\pi}\cdot 1 = \frac{\omega_1}{2\pi}\,|D|.$$


The main lemma needed for the proof of Weyl’s theorem is a monotonicity result.


Lemma 12.30. Let $D_1$ and $D_2$ be nonempty bounded open subsets of $\mathbb{R}^d$ with $D_1 \subseteq D_2$. Then the corresponding Dirichlet eigenvalues, taking multiplicities into account, satisfy

$$\lambda_{n,D_1} \ge \lambda_{n,D_2}, \qquad n \ge 1.$$

As a consequence, $N_{D_1}(r) \le N_{D_2}(r)$ for all $r > 0$.


Proof This follows from the Courant–Fischer theorem, observing that zero extensions of functions in $H_0^1(D_1)$ belong to $H_0^1(D_2)$.

[Portrait: Hermann Weyl, 1885–1955]


The analogue of this lemma fails for Neumann boundary conditions. It is for this 
reason that we only present Weyl’s theorem for Dirichlet eigenvalues. The case of Neu- 
mann boundary conditions is discussed in the Notes to this chapter. 
Proof of Theorem 12.29 For an open subset $U$ of $\mathbb{R}^d$ we denote the Dirichlet Laplacian in $L^2(U)$ by $\Delta_U$.


Step 1 – The theorem is true if $D = \prod_{j=1}^d (a_j,b_j)$ is an open rectangle. To prove this there is no loss of generality in assuming that $a_j = 0$ for all $j = 1,\dots,d$. By the results of Example 12.23 and Section 3.5.c, the eigenfunctions for $-\Delta_D$ are the functions

$$u_n(x) = \prod_{j=1}^d \sin(n_j\pi x_j/b_j), \qquad x = (x_1,\dots,x_d) \in D, \tag{12.19}$$

where $n = (n_1,\dots,n_d)$ with each $n_j$ in $\mathbb{N}_+ := \{n \in \mathbb{N} : n \ge 1\}$. The corresponding eigenvalues are the positive real numbers $\lambda_n = \pi^2\sum_{j=1}^d n_j^2/b_j^2$. Hence,

$$N_D(r) = \#\Big\{n \in \mathbb{N}_+^d : \sum_{j=1}^d \frac{n_j^2}{b_j^2} \le \frac{r}{\pi^2}\Big\}.$$

As $r \to \infty$, this is asymptotic to $2^{-d}\pi^{-d}r^{d/2}\omega_d\prod_{j=1}^d b_j$, namely, a fraction $1/2^d$ (the ‘positive quadrant’) of the volume enclosed by the ellipsoid $\sum_{j=1}^d x_j^2/b_j^2 = r/\pi^2$. Thus,

$$\lim_{r\to\infty} \frac{N_D(r)}{r^{d/2}} = \frac{\omega_d}{(2\pi)^d}\prod_{j=1}^d b_j.$$

Since $\prod_{j=1}^d b_j$ equals the volume of $D$, this is precisely what we wanted to prove.
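The lattice-point count in Step 1 is easy to test numerically for $d = 2$. The sketch below counts the eigenvalues $\pi^2(n_1^2/b_1^2 + n_2^2/b_2^2) \le r$ for a rectangle and compares $N_D(r)/r$ with the Weyl constant $\omega_2|D|/(2\pi)^2 = |D|/(4\pi)$. The side lengths, the values of $r$ and the function names are illustrative assumptions.

```python
import math

def count_dirichlet_eigenvalues(b1, b2, r):
    """N_D(r) for D = (0,b1) x (0,b2): the number of pairs (n1, n2) with
    n1, n2 >= 1 and pi^2 * (n1^2/b1^2 + n2^2/b2^2) <= r."""
    bound = r / math.pi**2
    count = 0
    n1 = 1
    while (n1 / b1) ** 2 <= bound:
        # number of n2 >= 1 with (n2/b2)^2 <= bound - (n1/b1)^2
        count += int(b2 * math.sqrt(bound - (n1 / b1) ** 2))
        n1 += 1
    return count

def weyl_constant(b1, b2):
    """omega_2 / (2*pi)^2 * |D|, where omega_2 = pi is the area of the unit disc."""
    return math.pi / (2 * math.pi) ** 2 * b1 * b2

b1, b2 = 1.0, 2.0
for r in (1e3, 1e5, 1e7):
    print(r, count_dirichlet_eigenvalues(b1, b2, r) / r, weyl_constant(b1, b2))
```

The quotient approaches the Weyl constant rather slowly, since the lattice points near the boundary of the ellipse contribute an error of order $r^{1/2}$.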


Step 2 – If $U_1$ and $U_2$ are disjoint open subsets of $\mathbb{R}^d$ and $U = U_1 \cup U_2$ is their union, then for all $r > 0$ we have

$$N_U(r) = N_{U_1}(r) + N_{U_2}(r). \tag{12.20}$$

Indeed, if $f \in L^2(U)$ is an eigenfunction for $-\Delta_U$, then for $j = 1,2$ the restriction $f|_{U_j}$ is either an eigenfunction for $-\Delta_{U_j}$ with the same eigenvalue or else it vanishes almost everywhere on $U_j$. If the first alternative happens for both $j = 1,2$, then $f$ contributes 2 to $N_U(r)$ (since we count multiplicities) and $f|_{U_j}$ contributes 1 to $N_{U_j}(r)$ for $j = 1,2$. If the second alternative happens for either $j = 1$ or $j = 2$ (both cannot happen, for this would imply that $f$ vanishes almost everywhere on $U$), then $f$ contributes 1 to $N_U(r)$ and one of the $f|_{U_j}$ contributes 1 to $N_{U_j}(r)$ and the other one contributes 0.

In the converse direction, if $f_j \in L^2(U_j)$ is an eigenfunction for the Dirichlet Laplacian on $U_j$, then its zero extension to $U$ is an eigenfunction for the Dirichlet Laplacian on $U$. This contributes 1 to $N_U(r)$ and 1 to $N_{U_j}(r)$.


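Step 3 – By Steps 1 and 2 (applied inductively), the theorem holds for finite unions of disjoint open rectangles. Hence, if $R_1,\dots,R_k$ are disjoint open rectangles contained in $D$ and $R$ denotes their union, then, by Lemma 12.30,

$$\frac{\omega_d}{(2\pi)^d}\,|R| = \lim_{r\to\infty} \frac{N_R(r)}{r^{d/2}} \le \liminf_{r\to\infty} \frac{N_D(r)}{r^{d/2}}.$$

Approximating $D$ from within by such finite unions, we obtain

$$\frac{\omega_d}{(2\pi)^d}\,|D| \le \liminf_{r\to\infty} \frac{N_D(r)}{r^{d/2}}. \tag{12.21}$$

Let $\rho > 0$ be so large that $D \subseteq (-\rho,\rho)^d$. By the same reasoning,

$$\frac{\omega_d}{(2\pi)^d}\,|(-\rho,\rho)^d \setminus \overline{D}| \le \liminf_{r\to\infty} \frac{N_{(-\rho,\rho)^d\setminus\overline{D}}(r)}{r^{d/2}}. \tag{12.22}$$

Using Step 1, (12.20), Lemma 12.30, (12.21) and (12.22), and the assumption $|\partial D| = 0$, we obtain

$$\frac{\omega_d(2\rho)^d}{(2\pi)^d} \ge \liminf_{r\to\infty} \frac{N_D(r)}{r^{d/2}} + \liminf_{r\to\infty} \frac{N_{(-\rho,\rho)^d\setminus\overline{D}}(r)}{r^{d/2}} \ge \frac{\omega_d}{(2\pi)^d}\,|D| + \frac{\omega_d}{(2\pi)^d}\,|(-\rho,\rho)^d \setminus \overline{D}| = \frac{\omega_d(2\rho)^d}{(2\pi)^d}. \tag{12.23}$$

It follows that we have equality throughout. Hence, using (12.20), Lemma 12.30 and Step 1 once more,

$$\frac{\omega_d(2\rho)^d}{(2\pi)^d} = \liminf_{r\to\infty} \frac{N_D(r)}{r^{d/2}} + \liminf_{r\to\infty} \frac{N_{(-\rho,\rho)^d\setminus\overline{D}}(r)}{r^{d/2}} \le \liminf_{r\to\infty} \frac{N_D(r)}{r^{d/2}} + \frac{\omega_d(2\rho)^d}{(2\pi)^d} - \limsup_{r\to\infty} \frac{N_D(r)}{r^{d/2}},$$

and therefore the limit inferior and the limit superior are equal. It follows that the limit $\lim_{r\to\infty} N_D(r)/r^{d/2}$ exists. Finally, if we had strict inequality in (12.21), that is, if

$$\frac{\omega_d}{(2\pi)^d}\,|D| < \liminf_{r\to\infty} \frac{N_D(r)}{r^{d/2}},$$

then in (12.23) the second inequality would be strict, contradicting the equality just established.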


If one imagines a bounded open set $D$ in $\mathbb{R}^d$ as a ‘drum’, the eigenvalues of the negative Dirichlet Laplacian on $L^2(D)$ can be interpreted as the ‘frequencies’ of the drum. This prompted the famous question of Mark Kac: “Can one hear the shape of a drum?”. In its mathematical formulation, the question is whether the shape of $D$, up to an isometry of $\mathbb{R}^d$, is determined by its sequence of frequencies. Without further assumptions on $D$, in general the answer is negative. Nevertheless, Weyl’s theorem implies that the volume $|D|$ of $D$ can be recovered from the spectrum.


Problems


12.1 Let $(i,V,H)$ be a Gelfand triple and let $A$ be the linear operator in $H$ associated with a bounded accretive form $\mathfrak{a}$ on $V$. Prove that the inclusion mapping from $D(A)$ into $V$ is bounded.

12.2 Let $(i,V,H)$ be a Gelfand triple. A form $\mathfrak{a}$ on $V$ is said to be elliptic if there exist $\lambda > 0$ and $\alpha > 0$ such that

$$\operatorname{Re}\mathfrak{a}(v,v) + \lambda\|v\|_H^2 \ge \alpha\|v\|_V^2, \qquad v \in V.$$

(a) Show that the form $\mathfrak{a}$ on $V$ is elliptic if and only if the form

$$\mathfrak{a}_\lambda(u,v) := \mathfrak{a}(u,v) + \lambda(u|v), \qquad u,v \in V,$$

is coercive on $V$.
(b) Prove a version of Corollary 12.13 for operators $A$ associated with an elliptic form $\mathfrak{a}$ on $V$.

12.3 Let $(i,V,H)$ be a Gelfand triple, let $\mathfrak{a}$ be a coercive form on $V$, and let $B \in \mathscr{L}(V,H)$ be bounded. Show that the form $\mathfrak{a}_B$ on $V$ defined by

$$\mathfrak{a}_B(u,v) := \mathfrak{a}(u,v) + (Bu|v), \qquad u,v \in V,$$

is bounded and elliptic.

12.4 Revisiting the conditions imposed in the treatment of the Sturm–Liouville problem in Section 11.3.b, let $D \subseteq \mathbb{R}^d$ be open and bounded and let $a: D \to M_d(\mathbb{C})$ be a function with bounded measurable coefficients such that

$$\operatorname{Re}\sum_{i,j=1}^d a_{ij}(x)\xi_i\overline{\xi_j} \ge \alpha|\xi|^2, \qquad \xi \in \mathbb{C}^d,$$

for some $\alpha > 0$ and almost all $x \in D$. Let $b: D \to \mathbb{K}^d$ have bounded measurable coefficients, and let $c: D \to \mathbb{K}$ be bounded and measurable. Show that the form

$$\mathfrak{a}(u,v) := \int_D a\nabla u\cdot\overline{\nabla v}\,dx + \int_D (b\cdot\nabla u)\,\overline{v}\,dx + \int_D cu\overline{v}\,dx, \qquad u,v \in H_0^1(D),$$

is elliptic.

12.5 Prove that the form $\overline{\mathfrak{a}}$ constructed in the proof of Proposition 12.16 has the following minimality property: if $\mathfrak{b}: V \times V \to \mathbb{C}$ is a closed form extending the continuous accretive form $\mathfrak{a}$, then $\mathfrak{b}$ extends $\overline{\mathfrak{a}}$.

12.6 Prove the following facts for $d = 1$ and $D = (a,b)$:

(a) $D(\Delta_{\mathrm{Dir}}) = \{f \in H^2(D) : f(a) = f(b) = 0\}$;

(b) $D(\Delta_{\mathrm{Neum}}) = \{f \in H^2(D) : f'(a) = f'(b) = 0\}$.

12.7 Let $V$ be a Hilbert space, let $\mathfrak{a}: V \times V \to \mathbb{K}$ be a bounded coercive form and let $L: V \to \mathbb{K}$ be a bounded functional.

(a) Show that the energy functional

$$E(x) := \tfrac12\mathfrak{a}(x,x) - \operatorname{Re}L(x)$$

is bounded from below.

Fix a nonempty closed convex subset $C$ of $V$.

(b) Let $(x_n)_{n\ge1}$ be a sequence in $C$ such that

$$\lim_{n\to\infty} E(x_n) = \inf_{x\in C} E(x) =: E > -\infty.$$

Prove that this sequence is Cauchy in $V$.

Hint: The convexity of $C$ implies that $\tfrac12(x_n + x_m) \in C$. Then use the identity

$$E\big(\tfrac12(x_n + x_m)\big) = \tfrac12E(x_n) + \tfrac12E(x_m) - \tfrac18\mathfrak{a}(x_n - x_m, x_n - x_m).$$

(c) Prove that $x := \lim_{n\to\infty} x_n$ is the unique element of $C$ minimising $E$.
(d) Compare this result with Problem 11.31.


12.8 We take a look at Example 12.23 from a Calculus perspective.
(a) For which $\lambda \in \mathbb{R}$ does the problem

$$-u'' = \lambda u \ \text{ on } (0,1), \qquad u(0) = u(1) = 0,$$

admit a nonzero $C^2$-solution? For these values of $\lambda$, find all $C^2$-solutions.

(b) Do the same for Neumann boundary conditions $u'(0) = u'(1) = 0$.

(c) Explain why this is not enough to determine the spectra of the Dirichlet and Neumann Laplacians in $L^2(0,1)$.


12.9 Provide the details of the proof that all Dirichlet eigenfunctions on a cube are 
given by (12.19). 
Semigroups of Linear Operators


In this chapter we set up a functional analytic framework for the study of linear and nonlinear initial value problems. This includes the treatment of parabolic problems such as the heat equation and hyperbolic problems such as the wave equation. From the operator-theoretic perspective the main challenge is to arrive at a thorough understanding of linear equations. This is achieved through the theory of $C_0$-semigroups developed in the present chapter. Once this is done, nonlinear equations are handled by perturbation techniques.


13.1 $C_0$-Semigroups


Linear equations of mathematical physics describing systems involving time evolution can often be cast in the abstract form

$$\begin{cases} u'(t) = Au(t) + f(t,u(t)), & t \in [0,T],\\ u(0) = u_0, \end{cases}$$

where the unknown is a function $u$ from the time interval $[0,T]$ into a Banach space $X$, the operator $A$ is a linear, usually unbounded, operator acting in $X$, $f : [0,T] \times X \to X$ is a given function, and the initial value $u_0$ is assumed to be an element of $X$. This initial value problem is referred to as the abstract Cauchy problem associated with $A$ and $f$. In applications, typically $X$ is a Banach space of functions suited for the particular problem and $A$ is a partial differential operator. For instance, for the heat equation on a bounded open set $D$ of $\mathbb{R}^d$ subject to Dirichlet boundary conditions one could choose $X = L^2(D)$ and take $A$ to be the Dirichlet Laplacian studied in the previous chapter.

If $A$ is a bounded operator, the unique solution $u$ of the linear abstract Cauchy problem

$$\begin{cases} u'(t) = Au(t), & t \in [0,T],\\ u(0) = u_0, \end{cases} \tag{ACP}$$

is given by

$$u(t) = e^{tA}u_0 = \sum_{n=0}^\infty \frac{t^n}{n!}A^nu_0, \qquad t \in [0,T].$$

The operators $e^{tA}$ may be thought of as ‘solution operators’ mapping the initial value $u_0$ to the solution $e^{tA}u_0$ at time $t$. For unbounded operators $A$ this simple strategy does not work since we run into convergence and domain issues. In the case of selfadjoint operators $A$ and, more generally, normal operators $A$ acting in a Hilbert space, one could instead use the functional calculus of Chapter 10 to define the exponentials $e^{tA}$. This would still limit the scope and applicability of the theory considerably. In order to set up a more general and flexible framework we take a more abstract approach which is motivated by the properties of the exponentials $e^{tA}$ for bounded operators $A$: they satisfy $e^{0A} = I$ and $e^{tA}e^{sA} = e^{(t+s)A}$, and the mapping $t \mapsto e^{tA}$ is continuous with respect to the operator norm.
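For a matrix $A$ these properties can be checked directly. The following sketch sums the power series for $e^{tA}$ and verifies $e^{0A} = I$ and $e^{tA}e^{sA} = e^{(t+s)A}$ numerically; the matrix (a generator of rotations), the number of series terms and the helper names are illustrative assumptions.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(A, t, terms=60):
    """Partial sum of the power series sum_k (t^k / k!) A^k for a square matrix A."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]               # current term t^k A^k / k!
    for k in range(1, terms):
        term = mat_mul(term, [[t * a / k for a in row] for row in A])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def max_diff(A, B):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

A = [[0.0, 1.0], [-1.0, 0.0]]   # A^2 = -I, so e^{tA} = [[cos t, sin t], [-sin t, cos t]]
t, s = 0.7, 1.1
rotation = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
print(max_diff(mat_exp(A, t), rotation))                                     # series vs closed form
print(max_diff(mat_mul(mat_exp(A, t), mat_exp(A, s)), mat_exp(A, t + s)))    # semigroup property
```

Both printed differences are at the level of rounding errors.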


13.1.a Definition and General Properties 


Throughout this chapter, X is a Banach space and H is a Hilbert space. 
The preceding discussion suggests the following definition. 


Definition 13.1 ($C_0$-semigroups). A family $S = (S(t))_{t\ge0}$ of bounded operators acting on $X$ is called a $C_0$-semigroup if the following three properties are satisfied:

(S1) $S(0) = I$;

(S2) (semigroup property) $S(t)S(s) = S(t+s)$ for all $t,s \ge 0$;

(S3) (strong continuity) $\lim_{t\downarrow0} \|S(t)x - x\| = 0$ for all $x \in X$.


Its infinitesimal generator, or briefly the generator, is the linear operator $A$ defined by

$$D(A) := \Big\{x \in X : \lim_{t\downarrow0} \frac1t(S(t)x - x) \text{ exists in } X\Big\},$$

$$Ax := \lim_{t\downarrow0} \frac1t(S(t)x - x), \qquad x \in D(A).$$


The idea is to interpret the orbit

$$u(t) := S(t)u_0$$


as ‘the solution’ of the linear problem (ACP). To find a precise way to make this idea 
rigorous, and to subsequently cover also nonlinear initial value problems, is among the 
main objectives of this chapter. 


Remark 13.2 (Strong convergence versus uniform convergence). The properties of $e^{tA}$ suggest replacing (S3) by the stronger condition $\lim_{t\downarrow0} \|S(t) - I\| = 0$. As it turns out, however, this condition forces the generator $A$ to be bounded (see Problem 13.2). This renders the theory useless, as it would fail to cover equations in which $A$ is a differential operator acting in a Banach space $X$ of functions. In a sense the strong convergence imposed in (S3) is also more natural, as it gives the continuity with respect to the norm of $X$ of the ‘solution’ $u(t) = S(t)u_0$ (see Proposition 13.4).


The next two propositions collect some elementary properties of $C_0$-semigroups and their generators.


Proposition 13.3. Let $S$ be a $C_0$-semigroup on $X$. There exist $M \ge 1$ and $\omega \in \mathbb{R}$ such that $\|S(t)\| \le Me^{\omega t}$ for all $t \ge 0$.


Proof There exists a number $\delta > 0$ such that $\sup_{t\in[0,\delta]} \|S(t)\| =: \sigma < \infty$. Indeed, otherwise we could find a sequence $t_n \downarrow 0$ such that $\lim_{n\to\infty} \|S(t_n)\| = \infty$. By the uniform boundedness theorem, this implies the existence of an $x \in X$ such that $\sup_{n\ge1} \|S(t_n)x\| = \infty$, contradicting the strong continuity assumption (S3).

By the semigroup property (S2), for $t \in [(k-1)\delta,k\delta]$ it follows that $\|S(t)\| \le \sigma^k \le \sigma^{1+t/\delta}$, where the second inequality uses that $\sigma \ge 1$ by (S1). This proves the proposition, with $M = \sigma$ and $\omega = \frac1\delta\log\sigma$.


We will frequently use the trivial observation that if $A$ generates the $C_0$-semigroup $(S(t))_{t\ge0}$, then for all scalars $\mu$ the linear operator $A - \mu$ generates the $C_0$-semigroup $(e^{-\mu t}S(t))_{t\ge0}$. For $\mu > \omega$ this rescaled semigroup has exponential decay in operator norm.


Proposition 13.4. Let $S$ be a $C_0$-semigroup on $X$ with generator $A$. The following assertions hold:


(1) for all $x \in X$ the orbit $t \mapsto S(t)x$ is continuous for $t \ge 0$;
(2) for all $x \in D(A)$ the orbit $t \mapsto S(t)x$ is continuously differentiable for $t \ge 0$, we have $S(t)x \in D(A)$, and

$$\frac{d}{dt}S(t)x = AS(t)x = S(t)Ax, \qquad t \ge 0;$$

(3) for all $x \in X$ and $t \ge 0$ we have $\int_0^t S(s)x\,ds \in D(A)$ and

$$A\int_0^t S(s)x\,ds = S(t)x - x,$$

and if $x \in D(A)$, then both sides are equal to $\int_0^t S(s)Ax\,ds$;


(4) the generator $A$ is a densely defined closed operator.


Proof The proof uses the calculus rules for Banach space-valued Riemann integrals (Proposition 1.45).


(1): Right continuity of $t \mapsto S(t)x$ follows from the right continuity at $t = 0$ (S3) and the semigroup property (S2). For left continuity, observe that by the semigroup property, for $0 < h < t$ we have

$$\|S(t)x - S(t-h)x\| \le \|S(t-h)\|\,\|S(h)x - x\| \le \sup_{s\in[0,t]} \|S(s)\|\,\|S(h)x - x\|,$$

where the supremum is finite by Proposition 13.3.


(2): Fix $x \in D(A)$. By the semigroup property we have

$$\lim_{h\downarrow0} \frac1h(S(t+h)x - S(t)x) = S(t)\lim_{h\downarrow0} \frac1h(S(h)x - x) = S(t)Ax.$$

This proves all assertions except left differentiability. For $t > 0$ we note that

$$\lim_{h\downarrow0} \frac{1}{-h}(S(t-h)x - S(t)x) = \lim_{h\downarrow0} S(t-h)\Big(\frac1h(S(h)x - x)\Big) = S(t)Ax,$$

where we used that $x \in D(A)$ and the fact that the convergence $\lim_{h\downarrow0} S(t-h)y = S(t)y$ for all $y \in X$ implies convergence uniformly on compact sets by Proposition 1.42.


(3): The first identity follows from

$$\lim_{h\downarrow0} \frac1h(S(h) - I)\int_0^t S(s)x\,ds = \lim_{h\downarrow0} \frac1h\Big(\int_0^t S(s+h)x\,ds - \int_0^t S(s)x\,ds\Big) = \lim_{h\downarrow0} \frac1h\Big(\int_t^{t+h} S(s)x\,ds - \int_0^h S(s)x\,ds\Big) = S(t)x - x,$$

where we first did a substitution and then used the continuity of $t \mapsto S(t)x$. The identity for $x \in D(A)$ follows by integrating the identity of part (2), or by noting that

$$\lim_{h\downarrow0} \frac1h(S(h) - I)\int_0^t S(s)x\,ds = \lim_{h\downarrow0} \int_0^t S(s)\Big(\frac1h(S(h)x - x)\Big)\,ds = \int_0^t S(s)Ax\,ds,$$

where the convergence under the integral is justified by the fact that the convergence of the difference quotients $\frac1h(S(h)x - x)$ to $Ax$ implies uniform convergence of the integrands on $[0,t]$.

(4): Denseness of $D(A)$ follows from (1) and the first part of (3): for any $x \in X$, the latter implies that $\int_0^t S(s)x\,ds \in D(A)$ for all $t > 0$, while the former implies that $\lim_{t\downarrow0} \frac1t\int_0^t S(s)x\,ds = x$.

To prove that $A$ is closed we must check that the graph $G(A) = \{(x,Ax) : x \in D(A)\}$ is closed in $X \times X$. Suppose that $(x_n)_{n\ge1}$ is a sequence in $D(A)$ such that $\lim_{n\to\infty} x_n = x$ and $\lim_{n\to\infty} Ax_n = y$ in $X$. Then, by the second part of (3),

$$\frac1h(S(h)x - x) = \lim_{n\to\infty} \frac1h(S(h)x_n - x_n) = \lim_{n\to\infty} \frac1h\int_0^h S(s)Ax_n\,ds = \frac1h\int_0^h S(s)y\,ds.$$

Passing to the limit for $h \downarrow 0$, this gives $x \in D(A)$ and $Ax = y$.
We have just seen that the generator of a $C_0$-semigroup is always densely defined and closed. As a consequence of the latter, $D(A)$ is a Banach space with respect to its graph norm. In various applications it is of interest to know when a subspace $Y$, which is dense in $X$ and contained in $D(A)$, is dense as a subspace of $D(A)$. If this is the case, $Y$ is called a core for $A$. The next result gives a simple sufficient condition.


Proposition 13.5. Let $S$ be a $C_0$-semigroup with generator $A$ on $X$. If $Y$ is a subspace of $D(A)$ which is dense in $X$ and invariant under each operator $S(t)$, $t \ge 0$, then $Y$ is dense in $D(A)$.


Proof The operator $A - \lambda$ is the generator of the $C_0$-semigroup $(e^{-\lambda t}S(t))_{t\ge0}$. Hence, by the exponential boundedness of $S$, replacing $A$ by $A - \lambda$ for sufficiently large $\lambda > 0$ we may assume that $\lim_{t\to\infty} \|S(t)\| = 0$.
Fix $x \in D(A)$ and choose a sequence $(y_n)_{n\ge1}$ in $Y$ such that $\lim_{n\to\infty} y_n = Ax$ in $X$. Fix $t > 0$. Then

$$\lim_{n\to\infty} \int_0^t S(s)y_n\,ds = \int_0^t S(s)Ax\,ds = S(t)x - x$$

in $X$ and

$$\lim_{n\to\infty} A\int_0^t S(s)y_n\,ds = \lim_{n\to\infty} (S(t)y_n - y_n) = S(t)Ax - Ax.$$

It follows that

$$\lim_{n\to\infty} \int_0^t S(s)y_n\,ds = S(t)x - x \ \text{ in } D(A).$$

The identity

$$\|S(t)x - x\|_{D(A)} = \|S(t)x - x\| + \|S(t)Ax - Ax\|$$

implies that the restriction of $S$ to $D(A)$ is strongly continuous with respect to the graph norm of $D(A)$, and for this reason we may approximate the integrals by Riemann sums in the norm of $D(A)$. By the invariance of $Y$ under $S$, these Riemann sums belong to $Y$. It follows that for each $t > 0$ and $\varepsilon > 0$ there is a $y_{t,\varepsilon} \in Y$ such that

$$\|(S(t)x - x) - y_{t,\varepsilon}\|_{D(A)} < \varepsilon.$$

As $t \to \infty$, $\|S(t)x\|_{D(A)} = \|S(t)x\| + \|S(t)Ax\| \to 0$, and therefore, for large enough $t > 0$,

$$\|y_{t,\varepsilon} + x\|_{D(A)} \le \varepsilon + \|S(t)x\|_{D(A)} < 2\varepsilon.$$

Since $-y_{t,\varepsilon} \in Y$, this shows that $x$ can be approximated in $D(A)$ by elements of $Y$.


This proposition is often helpful in determining the domain of the generator explicitly 
when the semigroup is given; see for instance Section 13.6.b. 

The proof of the next proposition uses the following version of the product rule. It is 
proved in the same way as the product rule in calculus; uniform convergence on compact 
sets follows from Proposition 1.42. 
Lemma 13.6. Let $I \subseteq \mathbb{R}$ be an open interval and let $S : I \to \mathscr{L}(X)$ and $T : I \to \mathscr{L}(X)$ be strongly continuous functions. Let $t_0 \in I$ and $x \in X$ be fixed. If


(i) $t \mapsto S(t)x$ is differentiable at $t_0$, with derivative

$$\frac{d}{dt}\Big|_{t=t_0} S(t)x =: S'(t_0)x,$$

(ii) $t \mapsto T(t)S(t_0)x$ is differentiable at $t_0$, with derivative

$$\frac{d}{dt}\Big|_{t=t_0} T(t)S(t_0)x =: T'(t_0)S(t_0)x,$$

then $t \mapsto T(t)S(t)x$ is differentiable at $t_0$, with derivative

$$\frac{d}{dt}\Big|_{t=t_0} T(t)S(t)x = T'(t_0)S(t_0)x + T(t_0)S'(t_0)x.$$


Proof For $t \in I \setminus \{t_0\}$ we have

$$\frac{T(t)S(t)x - T(t_0)S(t_0)x}{t - t_0} = \frac{T(t)S(t)x - T(t)S(t_0)x}{t - t_0} + \frac{T(t)S(t_0)x - T(t_0)S(t_0)x}{t - t_0} =: (\mathrm{I}) + (\mathrm{II}).$$

By assumption, (II) tends to $T'(t_0)S(t_0)x$ as $t \to t_0$. Concerning (I), fix $\delta > 0$ small enough so that $(t_0 - \delta,t_0 + \delta)$ is contained in $I$, set $I_{t_0,\delta} := (t_0 - \delta,t_0) \cup (t_0,t_0 + \delta)$, and consider the relatively compact set

$$C_{t_0,\delta} := \Big\{\frac{S(t)x - S(t_0)x}{t - t_0} - S'(t_0)x : t \in I_{t_0,\delta}\Big\}.$$

For $t \in I_{t_0,\delta}$ we have

$$\Big\|\frac{T(t)S(t)x - T(t)S(t_0)x}{t - t_0} - T(t_0)S'(t_0)x\Big\| \le \Big\|T(t)\Big(\frac{S(t)x - S(t_0)x}{t - t_0} - S'(t_0)x\Big)\Big\| + \|(T(t) - T(t_0))S'(t_0)x\|$$

$$\le \sup_{y\in C_{t_0,\delta}} \|T(t)y - T(t_0)y\| + \Big\|T(t_0)\Big(\frac{S(t)x - S(t_0)x}{t - t_0} - S'(t_0)x\Big)\Big\| + \|(T(t) - T(t_0))S'(t_0)x\|.$$

The strong continuity of $T$ and the fact that strong convergence implies uniform convergence on compact sets imply that the first and third terms on the right-hand side tend to 0 as $t \to t_0$. The second term tends to 0 by the assumptions on $S$.


A $C_0$-semigroup is uniquely determined by its generator:


Proposition 13.7. If $A$ is the generator of the $C_0$-semigroups $S$ and $T$, then $S(t) = T(t)$ for all $t \ge 0$.


Proof By Lemma 13.6, for all $t > 0$ and $x \in D(A)$ the function $\phi_t(s) := S(t-s)T(s)x$ is continuously differentiable on $[0,t]$ with derivative $\phi_t'(s) = -AS(t-s)T(s)x + S(t-s)AT(s)x = 0$, and therefore $\phi_t$ is constant by Proposition 1.45. Hence, $S(t)x = \phi_t(0) = \phi_t(t) = T(t)x$. This being true for all $x$ in the dense subspace $D(A)$ of $X$, it follows that $S(t) = T(t)$.


The next proposition identifies the resolvent of the generator as the Laplace transform 
of the semigroup. 


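Proposition 13.8. Let $A$ be the generator of a $C_0$-semigroup $S$ on $X$, and fix constants $M \ge 1$ and $\omega \in \mathbb{R}$ such that $\|S(t)\| \le Me^{\omega t}$ for all $t \ge 0$. Then $\{\lambda \in \mathbb{C} : \operatorname{Re}\lambda > \omega\} \subseteq \rho(A)$, and on this set the resolvent of $A$ is given by

$$R(\lambda,A)x = \int_0^\infty e^{-\lambda t}S(t)x\,dt, \qquad x \in X.$$

As a consequence, for $\operatorname{Re}\lambda > \omega$ we have

$$\|R(\lambda,A)\| \le \frac{M}{\operatorname{Re}\lambda - \omega}.$$

Proof Fix $x \in X$ and define $R_\lambda x := \int_0^\infty e^{-\lambda t}S(t)x\,dt$. Using the semigroup property (S2) and a substitution, we obtain the identity

$$\lim_{h\downarrow0} \frac1h(S(h) - I)R_\lambda x = \lim_{h\downarrow0} \frac1h\Big(e^{\lambda h}\int_h^\infty e^{-\lambda t}S(t)x\,dt - \int_0^\infty e^{-\lambda t}S(t)x\,dt\Big) = \lambda R_\lambda x - x,$$

from which it follows that $R_\lambda x \in D(A)$ and $AR_\lambda x = \lambda R_\lambda x - x$. This shows that the bounded operator $R_\lambda$ is a right inverse for $\lambda - A$.

Integrating by parts and using that $\frac{d}{dt}S(t)x = S(t)Ax$ for $x \in D(A)$ we obtain, for $x \in D(A)$,

$$\lambda\int_0^T e^{-\lambda t}S(t)x\,dt = -e^{-\lambda T}S(T)x + x + \int_0^T e^{-\lambda t}S(t)Ax\,dt.$$

Since $\operatorname{Re}\lambda > \omega$, sending $T \to \infty$ gives $\lambda R_\lambda x = x + R_\lambda Ax$. This shows that $R_\lambda$ is also a left inverse.
The estimate for the resolvent follows from

$$\Big\|\int_0^\infty e^{-\lambda t}S(t)x\,dt\Big\| \le \int_0^\infty e^{-\operatorname{Re}\lambda\,t}\|S(t)x\|\,dt \le M\|x\|\int_0^\infty e^{-(\operatorname{Re}\lambda - \omega)t}\,dt = \frac{M\|x\|}{\operatorname{Re}\lambda - \omega}.$$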
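The Laplace transform formula can be tested on a matrix example. The sketch below takes the triangular matrix $A = \begin{pmatrix} -1 & 1\\ 0 & -2\end{pmatrix}$ (an illustrative choice), for which $e^{tA}$ and $(\lambda - A)^{-1}$ can be written down by hand, and compares a Simpson approximation of $\int_0^T e^{-\lambda t}e^{tA}\,dt$ with the resolvent. The truncation point $T$, the step count and the function names are assumptions made for the illustration.

```python
import cmath
import math

def semigroup(t):
    """S(t) = e^{tA} for A = [[-1, 1], [0, -2]] (closed form for this triangular matrix)."""
    return [[math.exp(-t), math.exp(-t) - math.exp(-2 * t)],
            [0.0, math.exp(-2 * t)]]

def resolvent(lam):
    """(lam - A)^{-1} for the same matrix A, computed by hand."""
    return [[1 / (lam + 1), 1 / ((lam + 1) * (lam + 2))],
            [0.0, 1 / (lam + 2)]]

def laplace_transform(lam, T=40.0, steps=40000):
    """Composite Simpson approximation of int_0^T e^{-lam t} S(t) dt (steps must be even)."""
    h = T / steps
    total = [[0j, 0j], [0j, 0j]]
    for k in range(steps + 1):
        w = 1 if k in (0, steps) else (4 if k % 2 else 2)   # Simpson weights 1,4,2,...,4,1
        t = k * h
        S = semigroup(t)
        e = cmath.exp(-lam * t)
        for i in range(2):
            for j in range(2):
                total[i][j] += w * e * S[i][j]
    return [[h / 3 * total[i][j] for j in range(2)] for i in range(2)]

lam = 0.5 + 2.0j            # Re(lam) > omega = -1 for this semigroup
L, R = laplace_transform(lam), resolvent(lam)
print(max(abs(L[i][j] - R[i][j]) for i in range(2) for j in range(2)))
```

For this example the integrand decays like $e^{-1.5t}$, so truncating the integral at $T = 40$ introduces a negligible error.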
Combining this result with Proposition 10.30 we obtain the result that a $C_0$-semigroup is determined by its generator:


Proposition 13.9. If $A$ and $B$ generate $C_0$-semigroups on $X$, and if $B$ is an extension of $A$, then $A = B$.


Proof Proposition 13.8 implies that the resolvent sets of A and B share a common 
half-plane. The equality A = B then follows from Proposition 10.30. 


For operators satisfying the resolvent estimate of Proposition 13.8 we have the following convergence result.

Proposition 13.10. Let $A$ be a densely defined closed operator acting in $X$, and suppose that for some $M \ge 0$ and $\omega \in \mathbb{R}$ we have $\{\lambda > \omega\} \subseteq \rho(A)$ and

$$\|R(\lambda,A)\| \le \frac{M}{\lambda - \omega}, \qquad \lambda > \omega.$$

Then for all $x \in X$ we have

$$\lim_{\lambda\to\infty} \lambda R(\lambda,A)x = x.$$


Proof First let $x \in D(A)$ and fix an arbitrary $\mu \in \rho(A)$. Then $x = R(\mu,A)y$ for $y := (\mu - A)x$. By the resolvent identity and the above estimate on the resolvent we obtain

$$\lim_{\lambda\to\infty} \lambda R(\lambda,A)x = \lim_{\lambda\to\infty} \lambda R(\lambda,A)R(\mu,A)y = \lim_{\lambda\to\infty} \frac{\lambda}{\lambda - \mu}(R(\mu,A) - R(\lambda,A))y = R(\mu,A)y = x.$$

For general $x \in X$ the claim then follows by approximation with elements from $D(A)$, using the uniform boundedness of the operators $\lambda R(\lambda,A)$ for $\lambda \ge \omega + 1$.
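A simple example shows that the convergence in this proposition is strong but in general not uniform. The sketch below uses the diagonal operator $A = \operatorname{diag}(-k^2)_{k\ge1}$ on $\ell^2$, for which $\lambda R(\lambda,A) = \operatorname{diag}(\lambda/(\lambda + k^2))$ and $\|R(\lambda,A)\| \le 1/\lambda$, and the vector $x = (1/k)_{k\ge1}$. The operator, the vector and the truncation index $K$ are illustrative assumptions.

```python
import math

def error_norm(lam, K=100_000):
    """l^2 norm of lam*R(lam,A)x - x for A = diag(-k^2) and x_k = 1/k,
    with the sum truncated to the first K coordinates."""
    return math.sqrt(sum(((lam / (lam + k * k) - 1.0) / k) ** 2 for k in range(1, K + 1)))

errors = [error_norm(lam) for lam in (1e0, 1e2, 1e4, 1e6)]
print(errors)
```

The errors decrease to 0, but only at the slow rate $\lambda^{-1/4}$; for the unit vectors $e_k$ one has $\|\lambda R(\lambda,A)e_k - e_k\| = k^2/(\lambda + k^2)$, which is close to 1 when $k^2$ is much larger than $\lambda$, so $\|\lambda R(\lambda,A) - I\|$ does not tend to 0.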


The final result of this section gives a useful sufficient condition for a semigroup of operators to be strongly continuous. We need the following terminology. A family of bounded operators $S = (S(t))_{t\ge0}$ on $X$ is said to be a weakly continuous semigroup if conditions (S1) and (S2) in Definition 13.1 hold and (S3) is replaced by the condition that for all $x \in X$ and $x^* \in X^*$ one has $\lim_{t\downarrow0} \langle S(t)x,x^*\rangle = \langle x,x^*\rangle$.


Theorem 13.11 (Phillips). Every weakly continuous semigroup is strongly continuous. 
Proof Let 

Xo = (xX: lim |S(¢)x—x|] = 0}. 
It is evident that Xo is a linear subspace of X. We wish to show that Xp = X. 


By Proposition 5.5 the family {S(t)x : 0 <r < 1} is uniformly bounded for every 
x € X, and by the uniform boundedness theorem (Theorem 5.2) this implies that the 
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family {S(t) : 0 <t < 1} is uniformly bounded. A first consequence is that Xo is a 
closed subspace of X. Next we note that the weak continuity of f +> S(t)x along with 
the fact that closed subspaces are weakly closed (Proposition 4.42) implies that each 
orbit t ++ S(t)x is contained in a separable closed subspace of X. It follows that we can 
apply the Pettis measurability theorem (Theorem 4.19) and conclude that every orbit 
t+ S(t)x is strongly measurable. It follows from these considerations that the Bochner 
integrals x; := t J S(s)xds are well defined. 
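Fix $x \in X$ and $0 < t < \frac12$. For $0 < s < t$,

\[ \|S(s)x_t - x_t\| = \frac1t \Bigl\| \int_0^t S(s+r)x\,dr - \int_0^t S(r)x\,dr \Bigr\|
 = \frac1t \Bigl\| \int_t^{t+s} S(r)x\,dr - \int_0^s S(r)x\,dr \Bigr\|
 \le 2s \cdot \frac1t \Bigl( \sup_{0 \le r \le 1} \|S(r)\| \Bigr) \|x\|. \]

This shows that $x_t \in X_0$.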

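Suppose now, for a contradiction, that $X_0 \ne X$. Then there exists an $x \in X \setminus X_0$ and by
the Hahn–Banach theorem we can find an $x^* \in X^*$ which vanishes on $X_0$ but not at $x$.
Then, with $x_t = \frac1t \int_0^t S(s)x\,ds$ as before,

\[ 0 = \lim_{t\downarrow 0} \langle x_t, x^*\rangle = \lim_{t\downarrow 0} \frac1t \int_0^t \langle S(s)x, x^*\rangle\,ds = \langle x, x^*\rangle \ne 0, \]

a contradiction.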


13.1.b $C_0$-Groups


Instead of considering only forward time we could also include backward time. This
leads to the notion of a $C_0$-group.


Definition 13.12 ($C_0$-groups). A $C_0$-group is a family $S = (S(t))_{t\in\mathbb{R}}$ of bounded
operators acting on $X$ with the following properties:

(G1) $S(0) = I$;

(G2) $S(t)S(s) = S(t+s)$ for all $t, s \in \mathbb{R}$;

(G3) $\lim_{t\to 0} \|S(t)x - x\| = 0$ for all $x \in X$.

Its infinitesimal generator, or briefly its generator, is the linear operator $A$ defined by

\[ D(A) := \Bigl\{ x \in X : \lim_{t\to 0} \frac1t (S(t)x - x) \text{ exists} \Bigr\}, \]

\[ Ax := \lim_{t\to 0} \frac1t (S(t)x - x), \qquad x \in D(A). \]


It is evident from the definition that if $A$ generates a $C_0$-group $(S(t))_{t\in\mathbb{R}}$, then both
$(S(t))_{t\ge 0}$ and $(S(-t))_{t\ge 0}$ are $C_0$-semigroups. Denoting their generators by $A_+$ and $A_-$,
it is evident that $D(A) \subseteq D(A_+) \cap D(A_-)$ and that for all $x \in D(A)$ we have
$Ax = A_+x = -A_-x$. In fact, more is true:


Proposition 13.13. A linear operator $A$ in $X$ generates a $C_0$-group $(S(t))_{t\in\mathbb{R}}$ if and
only if both $A$ and $-A$ generate $C_0$-semigroups. These semigroups are $(S(t))_{t\ge 0}$ and
$(S(-t))_{t\ge 0}$, respectively.


Proof If $A$ generates a $C_0$-group $(S(t))_{t\in\mathbb{R}}$ and $x \in D(A_+)$, then

\[ \lim_{t\uparrow 0} \Bigl\| \frac1t (S(t)x - x) - A_+x \Bigr\|
 = \lim_{t\uparrow 0} \Bigl\| -\frac1t S(t) \int_0^{-t} S(s)A_+x\,ds - A_+x \Bigr\| = 0. \]

Since also $\lim_{t\downarrow 0} \frac1t (S(t)x - x) = A_+x$ it follows that $x \in D(A)$ and $Ax = A_+x$. In
combination with the inclusion $D(A) \subseteq D(A_+)$ it follows that $D(A) = D(A_+)$ and therefore
$A = A_+$. In the same way one proves that $D(A) = D(A_-)$ and $A = -A_-$.

For the converse, suppose that $A$ and $-A$ generate $C_0$-semigroups $(S_+(t))_{t\ge 0}$ and
$(S_-(t))_{t\ge 0}$ respectively. By Lemma 13.6, for $x \in D(A) = D(-A)$ the function
$t \mapsto S_-(t)S_+(t)x$ is continuously differentiable and

\[ \frac{d}{dt} S_-(t)S_+(t)x = -AS_-(t)S_+(t)x + S_-(t)AS_+(t)x = 0, \]

where we used that $S_-(t)$ commutes with $A$. It follows from Proposition 1.45 that the
function $t \mapsto S_-(t)S_+(t)x$ is constant, and evaluation at $t = 0$ shows that $S_-(t)S_+(t)x = x$
for all $t \ge 0$. Since $D(A)$ is dense this identity extends to arbitrary $x \in X$. This proves
that $S_-(t)$ is a left inverse for $S_+(t)$. Interchanging the roles of $S_-(t)$ and $S_+(t)$ we
find that $S_-(t)$ is also a right inverse for $S_+(t)$. As a result, $S_+(t)$ is invertible and
$(S_+(t))^{-1} = S_-(t)$ for all $t \ge 0$. For $t \in \mathbb{R}$ define

\[ S(t) := \begin{cases} S_+(t), & t \ge 0, \\ S_-(-t), & t \le 0. \end{cases} \]

With what we have proved it is trivial to verify that $(S(t))_{t\in\mathbb{R}}$ is a $C_0$-group and that $A$ is
its generator.
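A concrete finite-dimensional instance may help fix ideas. The sketch below assumes the rotation group on $\mathbb{R}^2$, which is not an example from the text; the helper names are hypothetical. It checks the group law (G2) for times of both signs, the inverse relation $S(-t)S(t) = I$ used in the proof above, and the two-sided difference quotients defining the generator.

```python
import math


def rotation(t):
    """S(t): rotation of the plane by angle t, as a 2x2 matrix (list of rows)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]


def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def max_abs_diff(p, q):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


identity = [[1.0, 0.0], [0.0, 1.0]]
generator = [[0.0, -1.0], [1.0, 0.0]]   # A, the derivative of S(t) at t = 0

t, s = 0.7, -1.9
# (G2): S(t)S(s) = S(t + s), also for negative times.
print(max_abs_diff(matmul(rotation(t), rotation(s)), rotation(t + s)))
# S(-t) inverts S(t), as in the proof of Proposition 13.13.
print(max_abs_diff(matmul(rotation(-t), rotation(t)), identity))
# Difference quotients (S(h) - I)/h approach A as h -> 0 from either side.
for h in (1e-3, -1e-3):
    quotient = [[(rotation(h)[i][j] - identity[i][j]) / h for j in range(2)]
                for i in range(2)]
    print(h, max_abs_diff(quotient, generator))
```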


Proposition 13.8, applied to the semigroups generated by $\pm A$, implies:


Corollary 13.14. If $A$ generates a uniformly bounded $C_0$-group on $X$, then $\sigma(A) \subseteq i\mathbb{R}$.


The spectrum of the generator of a $C_0$-semigroup may be empty (an example is given
in Problem 13.4). This is contrasted by the second part of the following result. For a
uniformly bounded $C_0$-group $S$ on $X$ and $f \in L^1(\mathbb{R})$ we define $S(f) \in \mathscr{L}(X)$ by

\[ S(f)x := \int_{-\infty}^{\infty} f(t)S(t)x\,dt, \qquad x \in X. \tag{13.1} \]


Theorem 13.15. If $A$ generates a uniformly bounded $C_0$-group $S$ on $X$, then:

(1) if the Fourier transform of a function $f \in L^1(\mathbb{R})$ is compactly supported and
vanishes in a neighbourhood of $i\sigma(A)$, then $S(f) = 0$;

(2) if $X \ne \{0\}$, then $\sigma(A) \ne \emptyset$.


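Proof (1): For all $\delta > 0$ and $s \in \mathbb{R}$ we have $\pm\delta - is \in \rho(A)$, and for all $x \in X$ we have
the identities

\[ R(\delta - is, A)x = \int_0^\infty e^{-(\delta - is)t} S(t)x\,dt \]

and

\[ R(-\delta - is, A)x = -R(\delta + is, -A)x = -\int_0^\infty e^{-(\delta + is)t} S(-t)x\,dt. \]

Hence by dominated convergence, Fourier inversion (Theorem 5.21), Fubini's theorem,
and Propositions 13.8 and 13.13,

\begin{align*}
S(f)x &= \lim_{\delta\downarrow 0} \int_{-\infty}^\infty e^{-\delta|t|} f(t) S(t)x\,dt \\
&= \lim_{\delta\downarrow 0} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-\delta|t|} \Bigl( \int_{-\infty}^\infty e^{ist} \hat f(s)\,ds \Bigr) S(t)x\,dt \\
&= \lim_{\delta\downarrow 0} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty \hat f(s) \Bigl( \int_{-\infty}^\infty e^{ist} e^{-\delta|t|} S(t)x\,dt \Bigr) ds \\
&= \lim_{\delta\downarrow 0} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty \hat f(s) \bigl( R(\delta - is, A) - R(-\delta - is, A) \bigr)x\,ds.
\end{align*}

By dominated convergence, this identity immediately implies (1).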


(2): Suppose that $\sigma(A) = \emptyset$. The result of part (1) implies that $S(f) = 0$ for all
$f \in L^1(\mathbb{R})$ whose Fourier transform has compact support. We claim that such functions
are dense in $L^1(\mathbb{R})$. To see this, fix an arbitrary nonzero function $\phi \in C_c^\infty(\mathbb{R})$. Its
inverse Fourier transform $\psi := \check\phi$ belongs to $L^1(\mathbb{R})$ (since $\phi^{(k)} \in L^1(\mathbb{R})$ implies that
$|x|^k \psi(x)$ is bounded for all $k \in \mathbb{N}$). Since $\psi$ is nonzero by the injectivity of the
(inverse) Fourier transform, after multiplying with an appropriate scalar we may assume
that $\int_{\mathbb{R}} \psi\,dx = 1$. By Proposition 2.34 we then have $\lim_{\varepsilon\downarrow 0} \psi_\varepsilon * f = f$ in $L^1(\mathbb{R})$, where
$\psi_\varepsilon(x) := \varepsilon^{-1}\psi(\varepsilon^{-1}x)$, and the Fourier transforms $\widehat{\psi_\varepsilon * f} = \sqrt{2\pi}\,\hat\psi_\varepsilon \hat f$ are compactly
supported. This proves the claim.

By approximation we obtain that $S(f) = 0$ for all $f \in L^1(\mathbb{R})$. In particular, taking
$f_0(t) := e^{-t}$ for $t > 0$ and $f_0(t) := 0$ for $t < 0$, Proposition 13.8 implies that
$R(1,A) = S(f_0) = 0$. Since the range of $R(1,A)$ equals $D(A)$, which is dense in $X$, this implies that $X = \{0\}$.


We will use this theorem to give a proof of Wiener's Tauberian theorem (Theorem
5.22). Recall that this theorem asserts that if the Fourier transform of a function
$f \in L^1(\mathbb{R})$ is zero-free, then the span of the set of all translates of $f$ is dense in $L^1(\mathbb{R})$.

We start with some preparations. If $S$ is a uniformly bounded $C_0$-group on a Banach
space $X$, we define

\[ I_S := \{ f \in L^1(\mathbb{R}) : S(f) = 0 \}, \]

where $S(f)$ is given by (13.1). The Arveson spectrum of $S$ is the set

\[ \operatorname{Sp}(S) := \{ \omega \in \mathbb{R} : \hat f(\omega) = 0 \text{ for all } f \in I_S \}. \]

The key to proving Wiener's Tauberian theorem is the following result, which is of
independent interest.


Theorem 13.16. Let $S$ be a uniformly bounded $C_0$-group with generator $A$ on $X$. Then
$\operatorname{Sp}(S) = i\sigma(A)$.


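Proof First let $\omega \in \mathbb{R}$ satisfy $\omega \notin i\sigma(A)$. Noting that $\sigma(A) \subseteq i\mathbb{R}$, we choose a function
$f \in L^1(\mathbb{R})$ whose Fourier transform is compactly supported and vanishes in a
neighbourhood of $i\sigma(A)$ but not at $\omega$. By Theorem 13.15, $S(f) = 0$, so $f \in I_S$. But then
$\hat f(\omega) \ne 0$ implies that $\omega \notin \operatorname{Sp}(S)$.

Conversely, let $\omega \in i\sigma(A)$. Since $\sigma(A) \subseteq i\mathbb{R}$ and since the topological boundary of
$\sigma(A)$ is always contained in the approximate point spectrum (see Section 10.1.c, where
it was observed that the corresponding result for bounded operators, Proposition 6.17,
extends to unbounded operators), $-i\omega$ is contained in the approximate point spectrum of
$A$. Hence we may choose a sequence $(x_n)_{n\ge 1}$ of norm one vectors in $X$, with $x_n \in D(A)$
for all $n \ge 1$, such that $\lim_{n\to\infty} \|Ax_n + i\omega x_n\| = 0$. In view of

\[ S(t)x_n - e^{-i\omega t}x_n = \int_0^t e^{-i\omega(t-s)} S(s)(A + i\omega)x_n\,ds \to 0 \quad\text{as } n \to \infty, \]

$(x_n)_{n\ge 1}$ is an approximate eigensequence of $S(t)$ with approximate eigenvalue $e^{-i\omega t}$.
Let $f \in L^1(\mathbb{R})$. By dominated convergence,

\[ \lim_{n\to\infty} \int_{-\infty}^\infty f(t)\bigl(S(t)x_n - e^{-i\omega t}x_n\bigr)\,dt = 0. \]

Thus, using that $\|x_n\| = 1$,

\[ \|S(f)\| \ge \lim_{n\to\infty} \|S(f)x_n\| = \lim_{n\to\infty} \Bigl\| \int_{-\infty}^\infty f(t)S(t)x_n\,dt \Bigr\|
 = \Bigl| \int_{-\infty}^\infty e^{-i\omega t} f(t)\,dt \Bigr| = \sqrt{2\pi}\,|\hat f(\omega)|. \]

This inequality implies that $\hat f(\omega) = 0$ for all $f \in I_S$. Therefore, $\omega \in \operatorname{Sp}(S)$.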


The right translation group is the $C_0$-group $U = (U(t))_{t\in\mathbb{R}}$ on $L^1(\mathbb{R})$ defined by

\[ U(t)f(s) := f(s - t), \qquad s, t \in \mathbb{R}. \]

Note that $U(f)g = f * g$ for all $f, g \in L^1(\mathbb{R})$, where $*$ denotes convolution.
We are now ready for the proof of Wiener's Tauberian theorem.

Proof of Theorem 5.22. Let $f \in L^1(\mathbb{R})$ be a function whose Fourier transform is zero-free
and let $X$ be the closed linear span of $\{U(t)f : t \in \mathbb{R}\}$. We wish to prove that $X = L^1(\mathbb{R})$. Consider
the quotient space $Y := L^1(\mathbb{R})/X$ and let $U_Y = (U_Y(t))_{t\in\mathbb{R}}$ denote the associated
quotient translation group on $Y$. This group is strongly continuous and bounded. For all
$g \in L^1(\mathbb{R})$ we have $U(f)g = f * g = g * f = U(g)f$. By the translation invariance of
$X$, $U(g)f \in X$. Hence $U(f)g \in X$, so $U_Y(f)(g + X) = 0$ for all $g \in L^1(\mathbb{R})$. It follows
that $U_Y(f) = 0$, that is, $f \in I_{U_Y}$. On the other hand, by assumption we have $\hat f(\omega) \ne 0$ for all $\omega \in \mathbb{R}$.
Therefore, $\operatorname{Sp}(U_Y) = \emptyset$. By Theorem 13.16 the generator of $U_Y$ has empty spectrum, and
Theorem 13.15(2) then gives $Y = \{0\}$, that is, $X = L^1(\mathbb{R})$.


13.2 The Hille-Yosida Theorem 


The main generation theorem for $C_0$-semigroups is the Hille–Yosida theorem, which
gives necessary and sufficient conditions in terms of resolvent growth. We only need
the version for contraction semigroups, which is somewhat easier to state and prove. Its
extension to general semigroups can be done via the same reductions that will be used
in the proof of Theorem 13.18 (see Problem 13.1).


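Theorem 13.17 (Hille–Yosida). For a densely defined closed linear operator $A$ in $X$
the following assertions are equivalent:

(1) $A$ generates a $C_0$-semigroup of contractions on $X$;

(2) $\{\lambda \in \mathbb{C} : \operatorname{Re}\lambda > 0\} \subseteq \rho(A)$ and

\[ \|R(\lambda,A)\| \le \frac{1}{\operatorname{Re}\lambda}, \qquad \operatorname{Re}\lambda > 0; \]

(3) $\{\lambda \in \mathbb{R} : \lambda > 0\} \subseteq \rho(A)$ and

\[ \|R(\lambda,A)\| \le \frac{1}{\lambda}, \qquad \lambda > 0. \]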


Proof The implication (1)$\Rightarrow$(2) follows from Propositions 13.4 and 13.8, and the
implication (2)$\Rightarrow$(3) is trivial.

Assume now that (3) holds. For the bounded operators $A_n := nAR(n,A) = n^2R(n,A) - nI$,
$n \ge 1$, Proposition 13.10 implies $\lim_{n\to\infty} A_nx = Ax$ for all $x \in D(A)$. Also,

\[ \|e^{tA_n}\| \le e^{n^2\|R(n,A)\|t} e^{-nt} \le e^{nt}e^{-nt} = 1. \tag{13.2} \]

Fix $x \in D(A)$ and $t \ge 0$. The identity

\[ e^{tA_n}x - e^{tA_m}x = \int_0^t \frac{d}{ds}\bigl[ e^{(t-s)A_m} e^{sA_n}x \bigr]\,ds = \int_0^t e^{(t-s)A_m} e^{sA_n} (A_nx - A_mx)\,ds \]

and the contractivity estimate (13.2) imply

\[ \|e^{tA_n}x - e^{tA_m}x\| \le t\,\|A_nx - A_mx\|, \]

and therefore $(e^{tA_n}x)_{n\ge 1}$ is Cauchy in $X$ for all $x \in D(A)$. Hence, the limit
$S(t)x := \lim_{n\to\infty} e^{tA_n}x$ exists for all $x \in D(A)$. By the uniform boundedness of the operators $e^{tA_n}$
guaranteed by (13.2), this limit in fact exists for all $x \in X$. Moreover, for each $t \ge 0$
the resulting mapping $x \mapsto S(t)x$ is linear and contractive. It remains to verify that the
contractions $S(t)$, $t \ge 0$, form a $C_0$-semigroup on $X$ and that $A$ is its generator.

It is clear that $S(0) = I$. The semigroup property follows from

\[ S(t)S(s)x = \lim_{n\to\infty} e^{tA_n}e^{sA_n}x = \lim_{n\to\infty} e^{(t+s)A_n}x = S(t+s)x, \]

using the uniform boundedness of the sequence $(e^{tA_n})_{n\ge 1}$ in the first equality and the
properties of the power series of the exponential function in the second.

Next we prove the strong continuity. For $x \in D(A)$ we have

\[ S(t)x - x = \lim_{n\to\infty} e^{tA_n}x - x = \lim_{n\to\infty} \int_0^t e^{sA_n}A_nx\,ds = \int_0^t S(s)Ax\,ds, \]

where we used that

\begin{align*}
\|e^{sA_n}A_nx - S(s)Ax\| &\le \|e^{sA_n}(A_nx - Ax)\| + \|(e^{sA_n} - S(s))Ax\| \\
&\le \|A_nx - Ax\| + \|(e^{sA_n} - S(s))Ax\| \to 0.
\end{align*}

Therefore, for $x \in D(A)$,

\[ \lim_{t\downarrow 0} S(t)x - x = \lim_{t\downarrow 0} \int_0^t S(s)Ax\,ds = 0. \]

Once again the strong continuity for general $x \in X$ follows from this by approximation.
It remains to check that $A$ equals the generator of $S$, which we denote by $B$. By what
we have already proved, for $x \in D(A)$ we have

\[ \lim_{t\downarrow 0} \frac1t (S(t)x - x) = \lim_{t\downarrow 0} \frac1t \int_0^t S(s)Ax\,ds = Ax, \]

so $x \in D(B)$ and $Bx = Ax$. Since both $A$ and $B$ are closed and share a half-line in their
resolvent sets, Proposition 10.30 implies that $A = B$.
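The construction in this proof, approximating $A$ by the bounded operators $A_n = n^2R(n,A) - n$ and passing to the limit in $e^{tA_n}$, can be watched in the simplest possible setting. The sketch below assumes the scalar generator $A = a$ with $a \le 0$ on $X = \mathbb{R}$, so that $R(n,A) = 1/(n-a)$; the function names are hypothetical.

```python
import math


def yosida(a, n):
    """Yosida approximant A_n = n^2 R(n, A) - n for the scalar generator A = a."""
    return n * n / (n - a) - n


def yosida_semigroup(a, n, t):
    """exp(t A_n) for the scalar generator A = a."""
    return math.exp(t * yosida(a, n))


a, t = -3.0, 0.5            # hypothetical data: a <= 0 gives a contraction semigroup
exact = math.exp(t * a)
for n in (1, 10, 100, 1000, 10000):
    approx = yosida_semigroup(a, n, t)
    print(f"n = {n:6d}   A_n = {yosida(a, n):.6f}   "
          f"|exp(tA_n) - exp(tA)| = {abs(approx - exact):.2e}")
```

Since $A_n = na/(n-a) \le 0$ here, each $e^{tA_n}$ is a contraction, in line with (13.2).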


As an application we have the following perturbation result.


Theorem 13.18 (Perturbation). Let $A$ be the generator of a $C_0$-semigroup $S$ on $X$ and
let $B$ be a bounded operator on $X$. Then $A + B$ generates a $C_0$-semigroup $S_B$ on $X$.


Here it is understood that $D(A + B) = D(A)$ and $(A + B)x = Ax + Bx$ for $x \in D(A)$.
The proof of the theorem shows that if $\|S(t)\| \le Me^{\omega t}$, then $\|S_B(t)\| \le Me^{(\omega + M\|B\|)t}$.


Proof We prove the theorem in three steps. We begin with two reductions.


Step 1 – Choose $M \ge 1$ and $\omega \in \mathbb{R}$ such that $\|S(t)\| \le Me^{\omega t}$ for all $t \ge 0$. The operator
$A - \omega$ is the generator of the $C_0$-semigroup $(e^{-\omega t}S(t))_{t\ge 0}$, and this semigroup satisfies
$\|e^{-\omega t}S(t)\| \le M$. Since

\[ A + B = (A - \omega) + (B + \omega) \]

and $B + \omega$ is bounded, this argument shows that it is enough to prove the theorem for
uniformly bounded semigroups.


Step 2 – We now assume that the semigroup generated by $A$ is uniformly bounded,
say by a constant $M \ge 1$. From

\[ \|x\| \le \sup_{t\ge 0} \|S(t)x\| \le M\|x\| \]

it follows that

\[ |\!|\!|x|\!|\!| := \sup_{t\ge 0} \|S(t)x\| \]

defines an equivalent norm on $X$. With respect to this norm, for all $x \in X$ we have

\[ |\!|\!|S(s)x|\!|\!| = \sup_{t\ge 0} \|S(s+t)x\| \le \sup_{t\ge 0} \|S(t)x\| = |\!|\!|x|\!|\!|. \]

This argument shows that we may assume that our semigroup is a semigroup of
contractions.


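Step 3 – By the previous two steps it suffices to prove the theorem for generators of
contraction semigroups.
Fix $\lambda \in \mathbb{C}$ with $\operatorname{Re}\lambda > 0$. Then $\lambda \in \rho(A)$ and $\|R(\lambda,A)\| \le (\operatorname{Re}\lambda)^{-1}$. Because

\[ \lambda - (A + B) = (I - BR(\lambda,A))(\lambda - A) \]

and $\|BR(\lambda,A)\| \le (\operatorname{Re}\lambda)^{-1}\|B\|$, for $\operatorname{Re}\lambda > \|B\|$ the operator $I - BR(\lambda,A)$ is invertible,
and the Neumann series for its inverse gives

\[ \|R(\lambda,A)\|\,\|(I - BR(\lambda,A))^{-1}\| \le (\operatorname{Re}\lambda)^{-1} \bigl(1 - (\operatorname{Re}\lambda)^{-1}\|B\|\bigr)^{-1} = (\operatorname{Re}\lambda - \|B\|)^{-1}. \]

Hence, for $\operatorname{Re}\lambda > \|B\|$, the operator $\lambda - (A + B)$ is invertible and its inverse satisfies

\[ \|R(\lambda, A + B)\| \le (\operatorname{Re}\lambda - \|B\|)^{-1}. \]

The operator $\lambda - (A + B - \|B\|)$ is then invertible for $\operatorname{Re}\lambda > 0$ and satisfies

\[ \|(\lambda - (A + B - \|B\|))^{-1}\| \le (\operatorname{Re}\lambda)^{-1}. \]

By the Hille–Yosida theorem the operator $A + B - \|B\|$ generates a $C_0$-semigroup $T$ of contractions.
Then $A + B$ generates the $C_0$-semigroup given by $S_B(t) := e^{t\|B\|}T(t)$.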

Clearly, $\|S_B(t)\| \le e^{t\|B\|}$. Remembering that we made two reductions, reversing them
gives the estimate given after the statement of the theorem.

In general there is no closed-form expression for $S_B(t)$, but we do have the so-called
variation of constants identity

\[ S_B(t)x = S(t)x + \int_0^t S(t - s)BS_B(s)x\,ds. \]

The proof of this identity is simple: for $x \in D(A) = D(A + B)$, using Lemma 13.6 we
differentiate the function $\phi(s) = S(t - s)S_B(s)x$ using the product rule and get

\[ \phi'(s) = -AS(t - s)S_B(s)x + S(t - s)(A + B)S_B(s)x = S(t - s)BS_B(s)x. \]

Integrating this identity over the interval $[0,t]$ gives the required result for $x \in D(A)$;
the general case follows by density.
As a consequence of this identity we see that the norm of the difference is of the order

\[ \|S(t) - S_B(t)\| = O(t) \quad\text{as } t \downarrow 0. \]
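The variation of constants identity can be checked numerically in the scalar case. This is a sketch under simplifying assumptions that are not in the text: $X = \mathbb{R}$, $A = a$, $B = b$, so that $S(t) = e^{at}$ and $S_B(t) = e^{(a+b)t}$; the integral is approximated with a composite Simpson rule, and all function names are hypothetical.

```python
import math


def simpson(g, lo, hi, panels=200):
    """Composite Simpson rule with an even number of panels."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3


def variation_of_constants_rhs(a, b, t):
    """S(t) + int_0^t S(t - s) B S_B(s) ds for S(t) = e^{at}, S_B(t) = e^{(a+b)t}."""
    integrand = lambda s: math.exp(a * (t - s)) * b * math.exp((a + b) * s)
    return math.exp(a * t) + simpson(integrand, 0.0, t)


a, b = -1.0, 0.4             # hypothetical scalar generator and perturbation
for t in (0.01, 0.1, 1.0):
    lhs = math.exp((a + b) * t)              # S_B(t)
    rhs = variation_of_constants_rhs(a, b, t)
    gap = abs(lhs - math.exp(a * t))         # |S_B(t) - S(t)|
    print(f"t = {t:5.2f}   |lhs - rhs| = {abs(lhs - rhs):.1e}   |S_B - S|/t = {gap / t:.4f}")
```

The last column stays bounded as $t$ decreases, which is the $O(t)$ behaviour of $\|S(t) - S_B(t)\|$ in this toy case.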


We continue with a useful approximation formula, by means of which it is possible 
to deduce information about the semigroup from information about the properties of the 
resolvent along the positive real line. It will be used later on to prove the positivity of 
the heat semigroup under Dirichlet and Neumann boundary conditions. 

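To motivate the result we recall Euler's formula for the exponential, which entails
that for all $a \in \mathbb{R}$ and $t > 0$,

\[ e^{ta} = \lim_{n\to\infty} \Bigl(1 - \frac{t}{n}a\Bigr)^{-n} = \lim_{n\to\infty} \Bigl(\frac{n}{t}\Bigl(\frac{n}{t} - a\Bigr)^{-1}\Bigr)^{n}. \]

Theorem 13.19 (Euler's formula). Let $A$ be the generator of a $C_0$-semigroup $S$ on $X$.
Then for all $x \in X$ and $t > 0$ we have

\[ S(t)x = \lim_{n\to\infty} \Bigl(\frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)\Bigr)^{n} x. \]
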
Proof By Proposition 10.28 the resolvent of $A$ is holomorphic with complex derivative
given by $\frac{d}{d\lambda} R(\lambda,A) = -R(\lambda,A)^2$. By induction this implies

\[ \frac{d^n}{d\lambda^n} R(\lambda,A) = (-1)^n n!\, R(\lambda,A)^{n+1}. \tag{13.3} \]

On the other hand, repeated differentiation under the integral in the Laplace transform
representation $R(\lambda,A)x = \int_0^\infty e^{-\lambda s}S(s)x\,ds$ gives

\[ \frac{d^n}{d\lambda^n} R(\lambda,A)x = (-1)^n \int_0^\infty s^n e^{-\lambda s} S(s)x\,ds. \]

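Substituting $s = rt$ and specialising to $\lambda = n/t$ we obtain

\[ \frac{d^n}{d\lambda^n} R(\lambda,A)x \Big|_{\lambda = n/t} = (-1)^n t^{n+1} \int_0^\infty (re^{-r})^n S(rt)x\,dr. \tag{13.4} \]

Combining (13.3) and (13.4), and using the identity

\[ \frac{n^{n+1}}{n!} \int_0^\infty (re^{-r})^n\,dr = 1, \tag{13.5} \]

we arrive at

\begin{align*}
\Bigl(\frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)\Bigr)^{n+1} x - S(t)x
&= \frac{n^{n+1}}{n!} \int_0^\infty (re^{-r})^n S(rt)x\,dr - S(t)x \\
&= \frac{n^{n+1}}{n!} \int_0^\infty (re^{-r})^n \bigl( S(rt)x - S(t)x \bigr)\,dr.
\end{align*}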

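Fixing $x \in X$ and $\varepsilon > 0$, by strong continuity we may choose $0 < a < 1 < b < \infty$ in such
a way that

\[ \sup_{a \le r \le b} \|S(rt)x - S(t)x\| < \varepsilon. \]

We split the integral into three parts $I_1$, $I_2$, and $I_3$ corresponding to $[0,a]$, $[a,b]$, and
$[b,\infty)$ and estimate each part separately, using that $u \mapsto ue^{-u}$ is increasing on $[0,1]$ and
decreasing on $[1,\infty)$. For the first integral, using the elementary bound

\[ \frac{n^n}{n!} \le \frac{e^n}{\sqrt{2\pi n}} \tag{13.6} \]

we obtain

\[ \|I_1\| \le \frac{n^{n+1}}{n!} (ae^{-a})^n \int_0^a \|S(rt)x - S(t)x\|\,dr
 \le \frac{1}{\sqrt{2\pi}}\, a\, n^{1/2} e^n (ae^{-a})^n \cdot 2 \sup_{0 \le s \le t} \|S(s)\|\,\|x\|, \]

which tends to $0$ as $n \to \infty$ since $ae^{-a} < e^{-1}$. Next, using (13.5),

\[ \|I_2\| \le \frac{n^{n+1}}{n!} \int_a^b (re^{-r})^n \varepsilon\,dr \le \varepsilon\, \frac{n^{n+1}}{n!} \int_0^\infty (re^{-r})^n\,dr = \varepsilon. \]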
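To estimate $I_3$ we choose $M \ge 1$ and $\omega \in \mathbb{R}$ such that $\|S(t)\| \le Me^{\omega t}$. Choose $0 < \delta < 1$
so small that $be^{1-b(1-\delta)} < 1$; this is possible since $be^{-b} < e^{-1}$. For all
$n \ge \delta^{-1}(1 + |\omega| \max\{t, 1\})$, by (13.6) we have

\begin{align*}
\|I_3\| &\le \frac{n^{n+1}}{n!} \int_b^\infty (re^{-r})^n M\bigl(e^{\omega rt} + e^{\omega t}\bigr)\|x\|\,dr \\
&\le \frac{1}{\sqrt{2\pi}}\, n^{1/2} e^n \cdot 2M\|x\| \int_b^\infty \bigl(re^{-r(1-\delta)}\bigr)^n e^{-r}\,dr \\
&= \frac{1}{\sqrt{2\pi}}\, n^{1/2} e^n (1-\delta)^{-n} \cdot 2M\|x\| \int_b^\infty \bigl(r(1-\delta)e^{-r(1-\delta)}\bigr)^n e^{-r}\,dr \\
&\le \frac{1}{\sqrt{2\pi}}\, n^{1/2} \bigl(be^{1-b(1-\delta)}\bigr)^n \cdot 2M\|x\| \int_b^\infty e^{-r}\,dr \\
&\le \frac{1}{\sqrt{2\pi}}\, n^{1/2} \bigl(be^{1-b(1-\delta)}\bigr)^n \cdot 2M\|x\|,
\end{align*}

which tends to $0$ as $n \to \infty$ by the choice of $\delta$; we used the monotonicity of $u \mapsto ue^{-u}$
on $[1,\infty)$ to bound $(r(1-\delta))e^{-r(1-\delta)}$ by $(b(1-\delta))e^{-b(1-\delta)}$ (for this we also take $\delta$
small enough that $b(1-\delta) \ge 1$).
Collecting the estimates, we have shown that

\[ \lim_{n\to\infty} \Bigl(\frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)\Bigr)^{n+1} x = S(t)x. \]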

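This is almost the result we want, except for the power $n+1$ instead of $n$. To correct for
this we argue as follows. By Proposition 13.8,

\begin{align*}
\Bigl\| \Bigl(\frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)\Bigr)^{n} x - \Bigl(\frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)\Bigr)^{n+1} x \Bigr\|
&= \Bigl\| \Bigl(\frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)\Bigr)^{n} \Bigl( x - \frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)x \Bigr) \Bigr\| \\
&\le M \Bigl(1 - \frac{t}{n}|\omega|\Bigr)^{-n} \Bigl\| \frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)x - x \Bigr\|.
\end{align*}

As $n \to \infty$, by Euler's formula and Proposition 13.10 we have

\[ \Bigl(1 - \frac{t}{n}|\omega|\Bigr)^{-n} \to e^{t|\omega|}, \qquad \Bigl\| \frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)x - x \Bigr\| \to 0, \]

and therefore

\[ \lim_{n\to\infty} \Bigl(\frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)\Bigr)^{n} x = \lim_{n\to\infty} \Bigl(\frac{n}{t} R\Bigl(\frac{n}{t}, A\Bigr)\Bigr)^{n+1} x = S(t)x. \]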
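To see the rate of convergence in Euler's formula, one can again look at the scalar case. The sketch below assumes $X = \mathbb{R}$ and $A = a$, so that $R(\lambda,A) = 1/(\lambda - a)$ and the approximants reduce to $(1 - ta/n)^{-n}$; the function name is hypothetical.

```python
import math


def euler_approximation(a, t, n):
    """((n/t) R(n/t, A))^n for the scalar generator A = a, where R(lam, A) = 1/(lam - a)."""
    lam = n / t
    if lam <= a:
        raise ValueError("need n/t > a so that n/t lies in the resolvent set")
    return (lam / (lam - a)) ** n


a, t = -2.0, 1.5             # hypothetical data
exact = math.exp(t * a)
for n in (1, 10, 100, 1000, 10000):
    err = abs(euler_approximation(a, t, n) - exact)
    print(f"n = {n:6d}   error = {err:.2e}")
```

In this example the error decreases roughly like $1/n$, so the formula is mainly of theoretical use (for instance for transferring positivity from the resolvent to the semigroup) and not a fast numerical scheme.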


We conclude this section with a simple result about compact semigroups. 


Proposition 13.20 (Compact semigroups). Let $A$ be the generator of a $C_0$-semigroup $S$
on $X$. If $S(t)$ is a compact operator for every $t > 0$, then:

(1) the semigroup is uniformly continuous for $t > 0$;

(2) the resolvent operator $R(\lambda,A)$ is compact for every $\lambda \in \rho(A)$;

(3) the spectrum of $A$ is finite or countable and consists of isolated eigenvalues, and the
corresponding eigenspaces are finite-dimensional;

(4) for all $t > 0$ we have the spectral mapping formula

\[ \sigma(S(t)) \setminus \{0\} = \exp(t\sigma(A)). \]

Moreover, the eigenspaces corresponding to $\lambda \in \sigma(A)$ and $e^{\lambda t} \in \sigma(S(t))$ coincide.


Proof (1): Fix $s > 0$. For $t \ge s/2$ we have

\[ \|S(t) - S(s)\| = \sup_{\|x\| \le 1} \|(S(t - s/2) - S(s/2))S(s/2)x\|. \]

Since $S(s/2)B_X$ is relatively compact, where $B_X$ denotes the closed unit ball of $X$, by
Proposition 1.42 this implies $\lim_{t\to s} \|S(t) - S(s)\| = 0$.

(2): Choose $M \ge 1$ and $\omega \in \mathbb{R}$ such that $\|S(t)\| \le Me^{\omega t}$ for all $t \ge 0$. We first claim
that the compactness of the semigroup operators $S(t)$ for $t > 0$ implies that $R(\mu,A)$ is
compact for all $\operatorname{Re}\mu > \omega$. For all $x \in X$ and $t > 0$ we have

\[ S(t)R(\mu,A)x - R(\mu,A)x = \int_0^t S(s)AR(\mu,A)x\,ds, \]

and therefore

\[ \|S(t)R(\mu,A) - R(\mu,A)\| \le t \sup_{s \in [0,t]} \|S(s)AR(\mu,A)\|, \]

where $AR(\mu,A) = \mu R(\mu,A) - I$ is a bounded operator. Since $S(t)R(\mu,A)$ is compact
for every $t > 0$, letting $t \downarrow 0$ we obtain from Proposition 7.5 that $R(\mu,A)$ is compact. This proves
the claim. The compactness of $R(\lambda,A)$ for arbitrary $\lambda \in \rho(A)$ now follows from the
resolvent identity (10.2).


(3): This follows from Lemma 12.25.

Finally, for any fixed $\lambda \in \rho(A)$, a nonzero $x \in X$ is an eigenvector of $A$ with eigenvalue
$\mu$ if and only if it is an eigenvector of $R(\lambda,A)$ with eigenvalue $\frac{1}{\lambda - \mu}$. Since the eigenspaces
corresponding to the nonzero eigenvalues of the compact operator $R(\lambda,A)$ are
finite-dimensional, the second part of (3) follows.


(4): This follows from (3) and the next proposition.


In the next proposition we denote by $\sigma_p(B)$ the point spectrum of a bounded or
unbounded operator $B$, that is, the set of its eigenvalues.


Proposition 13.21 (Spectral mapping theorem for the point spectrum). Let $A$ be the
generator of a $C_0$-semigroup $S$ on $X$. Then

\[ \sigma_p(S(t)) \setminus \{0\} = \exp(t\sigma_p(A)), \qquad t \ge 0. \]

Moreover, the eigenspaces corresponding to $\lambda \in \sigma_p(A)$ and $e^{\lambda t} \in \sigma_p(S(t))$ coincide.


Proof If $x \in D(A)$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$, the identity

\[ \int_0^t e^{\lambda(t-s)} S(s)(\lambda - A)x\,ds = (e^{\lambda t} - S(t))x \]

shows that $S(t)x = e^{\lambda t}x$, that is, $e^{\lambda t}$ is an eigenvalue of $S(t)$ with eigenvector $x$. This
proves the inclusion $\sigma_p(S(t)) \setminus \{0\} \supseteq \exp(t\sigma_p(A))$.

The inclusion $\sigma_p(S(t)) \setminus \{0\} \subseteq \exp(t\sigma_p(A))$ is proved as follows. Fix $t > 0$ and
suppose that $x \in X$ is an eigenvector of $S(t)$ corresponding to a nonzero eigenvalue $\mu$. Then
$\mu = e^{\lambda t}$ for some $\lambda \in \mathbb{C}$. The identity $S(t)x = e^{\lambda t}x$ implies that the map $s \mapsto e^{-\lambda s}S(s)x$
is periodic with period $t$. Since this map is not identically zero, the uniqueness theorem
for the Fourier transform implies that (after scaling the interval $[0,t]$ to $[0,2\pi]$) at least
one of its Fourier coefficients is nonzero. Thus, there exists an integer $k \in \mathbb{Z}$ such that
with $\lambda_k := \lambda + 2\pi ik/t$ we have

\[ x_k := \frac1t \int_0^t e^{-\lambda_k s} S(s)x\,ds \ne 0. \]

We will show that $\lambda_k$ is an eigenvalue of $A$ with eigenvector $x_k$.
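Choose $M \ge 1$ and $\omega \in \mathbb{R}$ such that $\|S(\tau)\| \le Me^{\omega\tau}$ for all $\tau \ge 0$. By the $t$-periodicity of
$s \mapsto e^{-\lambda s}S(s)x$, for all $\operatorname{Re}\nu > \omega$ we have

\begin{align*}
R(\nu,A)x &= \int_0^\infty e^{-\nu s}S(s)x\,ds = \sum_{n=0}^\infty \int_{nt}^{(n+1)t} e^{-\nu s}S(s)x\,ds \\
&= \sum_{n=0}^\infty \int_0^t e^{-\nu s}S(s)\bigl(e^{-\nu nt}S(nt)x\bigr)\,ds = \sum_{n=0}^\infty e^{(\lambda-\nu)nt} \int_0^t e^{-\nu s}S(s)x\,ds \tag{13.7} \\
&= \frac{1}{1 - e^{(\lambda-\nu)t}} \int_0^t e^{-\nu s}S(s)x\,ds = \frac{1}{1 - e^{(\lambda_k-\nu)t}} \int_0^t e^{-\nu s}S(s)x\,ds.
\end{align*}

Since the integral on the right-hand side is an entire function of the variable $\nu$, this shows
that the map $\nu \mapsto R(\nu,A)x$ can be holomorphically extended to $\mathbb{C} \setminus \{\lambda + 2\pi in/t : n \in \mathbb{Z}\}$.
Denoting this extension by $F$, by (13.7) and the definition of $x_k$ we have
$\lim_{\nu\to\lambda_k} (\nu - \lambda_k)F(\nu) = x_k$ and

\[ \lim_{\nu\to\lambda_k} (\lambda_k - A)\bigl((\nu - \lambda_k)F(\nu)\bigr)
 = \lim_{\nu\to\lambda_k} \frac{\nu - \lambda_k}{1 - e^{(\lambda_k-\nu)t}} \Bigl( (I - e^{-\nu t}S(t))x + (\lambda_k - \nu) \int_0^t e^{-\nu s}S(s)x\,ds \Bigr)
 = \frac1t (0 + 0) = 0. \]

From the closedness of $A$ it follows that $x_k \in D(A)$ and $(\lambda_k - A)x_k = 0$.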

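It remains to prove the final statement on coincidence of eigenspaces. Let us denote
the eigenspaces corresponding to $\lambda \in \sigma_p(A)$ and $e^{\lambda t} \in \sigma_p(S(t))$ by $E_\lambda$ and $E_{\lambda,t}$
respectively. The first part of the proof shows that $E_\lambda \subseteq E_{\lambda,t}$. Denote by $F_\lambda$ and $F_{\lambda,t}$ the closed
linear spans of $\{S(t)x : x \in E_\lambda\}$ and $\{S(t)x : x \in E_{\lambda,t}\}$. Then $E_\lambda = F_\lambda \subseteq F_{\lambda,t}$ and the
second part of the proof shows that $F_{\lambda,t} \subseteq E_\lambda$ (because the vector $x_k$ belongs to $F_{\lambda,t}$).
Putting these inclusions together, we obtain

\[ F_\lambda = E_\lambda \subseteq F_{\lambda,t} \subseteq E_\lambda \quad\text{and}\quad E_\lambda \subseteq E_{\lambda,t} \subseteq F_{\lambda,t} \subseteq E_\lambda, \]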


and therefore all these subspaces coincide. 
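The first inclusion in the proof, that an eigenvector of $A$ for $\lambda$ is an eigenvector of $S(t)$ for $e^{\lambda t}$, is easy to test on a matrix example. The sketch assumes the rotation group on $\mathbb{C}^2$ with generator $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, which is an illustration chosen here and not taken from the text; the helper names are hypothetical.

```python
import cmath
import math


def apply(matrix, vector):
    """Apply a 2x2 real matrix to a complex vector of length 2."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]


def rotation(t):
    """S(t) = exp(tA) for A = [[0, -1], [1, 0]]."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]


generator = [[0.0, -1.0], [1.0, 0.0]]
eigenvalue = 1j                      # A v = i v
eigenvector = [1.0 + 0j, -1j]

t = 0.8
# Residuals of A v = lambda v and S(t) v = exp(lambda t) v.
res_generator = max(abs(av - eigenvalue * v)
                    for av, v in zip(apply(generator, eigenvector), eigenvector))
res_semigroup = max(abs(sv - cmath.exp(eigenvalue * t) * v)
                    for sv, v in zip(apply(rotation(t), eigenvector), eigenvector))
print(res_generator, res_semigroup)
```

Both residuals vanish up to rounding, so $e^{it}$ is an eigenvalue of $S(t)$ with the same eigenvector.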
13.3 The Abstract Cauchy Problem


Having set up the general theory of $C_0$-semigroups, it is time to put them to use in
solving abstract Cauchy problems.


13.3.a The Inhomogeneous Cauchy Problem 


If $A$ is the generator of a $C_0$-semigroup $S$ on $X$, then by Proposition 13.4 for initial values
$u_0 \in D(A)$ the function

\[ u(t) := S(t)u_0, \qquad t \ge 0, \tag{13.8} \]

solves the initial value problem

\[ \begin{cases} u'(t) = Au(t), & t \in [0,T], \\ u(0) = u_0, \end{cases} \tag{ACP} \]

in the sense that $u$ is continuously differentiable, takes values in $D(A)$, and satisfies
the equation pointwise in time. A function $u$ with these properties is called a classical
solution. However, the definition (13.8) makes sense for arbitrary $u_0 \in X$, not just for
$u_0 \in D(A)$, and for all $u_0 \in X$ the function $u(t) = S(t)u_0$ solves the following integrated
version of (ACP):

\[ u(t) = u_0 + A \int_0^t u(s)\,ds, \qquad t \in [0,T]. \tag{13.9} \]

Indeed, by Proposition 13.4(3), for arbitrary $u_0 \in X$ we have $\int_0^t S(s)u_0\,ds \in D(A)$ and
$A \int_0^t S(s)u_0\,ds = S(t)u_0 - u_0$, confirming that (13.9) holds for $u(t) = S(t)u_0$. This
observation leads to the notion of strong solution which we develop next in the more general
context of the inhomogeneous Cauchy problem

\[ \begin{cases} u'(t) = Au(t) + f(t), & t \in [0,T], \\ u(0) = u_0, \end{cases} \tag{IACP} \]

with initial value $u_0 \in X$. We assume that $f$ belongs to $L^1(0,T;X)$, the space of all
strongly measurable functions $f : (0,T) \to X$ such that

\[ \|f\|_1 := \int_0^T \|f(t)\|\,dt < \infty, \]

identifying functions that are equal almost everywhere. In the same way as in the
scalar-valued case one shows that $L^1(0,T;X)$ is a Banach space.


Definition 13.22 (Strong solutions). A strong solution of (IACP) is a continuous function
$u : [0,T] \to X$ such that for all $t \in [0,T]$ we have $\int_0^t u(s)\,ds \in D(A)$ and

\[ u(t) = u_0 + A \int_0^t u(s)\,ds + \int_0^t f(s)\,ds. \]




We proceed with an existence and uniqueness result for strong solutions of (IACP). 
It is based on the following lemma. 


Lemma 13.23. Let $f \in L^1(0,T;X)$. Then:

(1) for all $t \in [0,T]$ the function $s \mapsto S(t-s)f(s)$ has a strongly measurable
representative and is integrable on $[0,t]$;
(2) the function $t \mapsto \int_0^t S(t-s)f(s)\,ds$ is continuous on $[0,T]$.


Proof (1): Choose a strongly measurable representative for $f$, which we denote again
by $f$, as well as a sequence of simple functions $f_n$ converging to $f$ pointwise. Each
function $s \mapsto S(t-s)f_n(s)$ is strongly measurable, since it is a linear combination of
functions of the form $s \mapsto 1_B(s)S(t-s)x$ with $B \subseteq [0,T]$ a Borel subset, and such
functions are strongly measurable because continuous functions on an interval are strongly
measurable and if $f$ is strongly measurable and $B$ is a Borel set, then $1_Bf$ is strongly
measurable. By Proposition 1.48, their pointwise limit $s \mapsto S(t-s)f(s)$ is strongly
measurable. Integrability follows from the estimate $\|S(t-s)f(s)\| \le M\|f(s)\|$, where
$M = \sup_{t\in[0,T]} \|S(t)\|$, and the integrability of $f$.


(2): Let $0 \le t \le t' \le T$. Then

\[ \Bigl\| \int_0^{t'} S(t'-s)f(s)\,ds - \int_0^t S(t-s)f(s)\,ds \Bigr\|
 \le \Bigl\| \int_t^{t'} S(t'-s)f(s)\,ds \Bigr\| + \Bigl\| \int_0^t S(t'-s)f(s)\,ds - \int_0^t S(t-s)f(s)\,ds \Bigr\|. \]

The first term on the right-hand side can be bounded above by $M \int_t^{t'} \|f(s)\|\,ds$ which
tends to $0$ by dominated convergence as $t' - t \to 0$. The second term tends to $0$ by
dominated convergence as well: for simple functions $f$ this follows from the strong
continuity and local boundedness of the semigroup, and for general $f \in L^1(0,T;X)$ this
follows by approximation by simple functions in the $L^1(0,T;X)$-norm.


Theorem 13.24 (Existence and uniqueness). For all $u_0 \in X$ and $f \in L^1(0,T;X)$ the
problem (IACP) admits a unique strong solution $u$. It is given by the convolution formula

\[ u(t) = S(t)u_0 + \int_0^t S(t-s)f(s)\,ds. \tag{13.10} \]

If $f \in L^p(0,T;X)$ with $1 \le p < \infty$, then $u \in L^p(0,T;X)$.


The function $t \mapsto S(t)u_0 + \int_0^t S(t-s)f(s)\,ds$ is usually referred to as the mild solution
of (IACP).


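Proof For the existence part we will show that the right-hand side of (13.10) defines a
strong solution. By Lemma 13.23, this function is continuous.
We begin by showing that $\int_0^t u(s)\,ds \in D(A)$. We have $\int_0^t S(s)u_0\,ds \in D(A)$ by
Proposition 13.4. To prove that $\int_0^t \int_0^s S(s-r)f(r)\,dr\,ds \in D(A)$ we apply the definition of $A$.
As $h \downarrow 0$ we have, using Fubini's theorem,

\begin{align*}
\frac1h (S(h) - I) \int_0^t \int_0^s S(s-r)f(r)\,dr\,ds
&= \frac1h (S(h) - I) \int_0^t \int_r^t S(s-r)f(r)\,ds\,dr \\
&= \frac1h (S(h) - I) \int_0^t \int_0^{t-r} S(s)f(r)\,ds\,dr \\
&= \int_0^t \Bigl( \frac1h \int_{t-r}^{t-r+h} S(s)f(r)\,ds - \frac1h \int_0^h S(s)f(r)\,ds \Bigr)\,dr \\
&\to \int_0^t \bigl( S(t-r)f(r) - f(r) \bigr)\,dr.
\end{align*}

The convergence is justified by the dominated convergence theorem since for all
$0 < h < 1$ we have the following pointwise bound with respect to the variable $r$:

\begin{align*}
\Bigl\| \frac1h (S(h) - I) \int_0^{t-r} S(s)f(r)\,ds \Bigr\|
&= \frac1h \Bigl\| \int_{t-r}^{t-r+h} S(s)f(r)\,ds - \int_0^h S(s)f(r)\,ds \Bigr\| \\
&\le \frac1h \int_{t-r}^{t-r+h} \|S(s)f(r)\|\,ds + \frac1h \int_0^h \|S(s)f(r)\|\,ds \\
&\le 2M_t \|f(r)\|,
\end{align*}

where $M_t := \sup_{0 \le \tau \le t+1} \|S(\tau)\|$.
The above computation shows that $\int_0^t u(s)\,ds \in D(A)$ and

\begin{align*}
A \int_0^t u(s)\,ds &= A \int_0^t S(s)u_0\,ds + A \int_0^t \int_0^s S(s-r)f(r)\,dr\,ds \\
&= (S(t)u_0 - u_0) + \Bigl( u(t) - S(t)u_0 - \int_0^t f(r)\,dr \Bigr).
\end{align*}

This shows that the function $u$ given by (13.10) is a strong solution.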

To prove uniqueness, suppose that $u$ and $\tilde u$ are strong solutions of (IACP). It follows
from the definition that $u$ and $\tilde u$ are continuous. Set $v := u - \tilde u$. Then $v$ is continuous,
$\int_0^t v(s)\,ds \in D(A)$, and $v(t) = A \int_0^t v(s)\,ds$ for all $t \in [0,T]$.

Fix $0 < t \le T$ and define $w : [0,t] \to X$ by $w(s) = S(t-s) \int_0^s v(r)\,dr$. This function is
differentiable with derivative

\[ w'(s) = S(t-s)v(s) - S(t-s)A \int_0^s v(r)\,dr = S(t-s)v(s) - S(t-s)v(s) = 0. \]

It follows that $w$ is constant. Hence

\[ \int_0^t v(r)\,dr = w(t) = w(0) = 0. \]

Since this is true for all $0 < t \le T$ and $v$ is continuous, it follows that $v(t) = 0$ for all
$t \in [0,T]$.
The final assertion is a consequence of Young's inequality.
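The convolution formula (13.10) can be compared with a known closed-form solution in the scalar case. The sketch below assumes $X = \mathbb{R}$, $A = a$, and the forcing $f(t) = \cos t$, none of which come from the text; the integral is approximated by a composite Simpson rule and the function names are hypothetical.

```python
import math


def simpson(g, lo, hi, panels=200):
    """Composite Simpson rule with an even number of panels."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3


def mild_solution(a, u0, f, t):
    """u(t) = S(t)u0 + int_0^t S(t - s) f(s) ds for the scalar semigroup S(t) = e^{at}."""
    return math.exp(a * t) * u0 + simpson(lambda s: math.exp(a * (t - s)) * f(s), 0.0, t)


def exact_solution(a, u0, t):
    """Closed-form solution of u' = a u + cos(t), u(0) = u0."""
    particular = (math.sin(t) - a * math.cos(t) + a * math.exp(a * t)) / (1 + a * a)
    return math.exp(a * t) * u0 + particular


a, u0 = -0.5, 2.0            # hypothetical data
for t in (0.5, 1.0, 3.0):
    diff = abs(mild_solution(a, u0, math.cos, t) - exact_solution(a, u0, t))
    print(f"t = {t:3.1f}   |mild - exact| = {diff:.1e}")
```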


The solution $u$ depends continuously on $u_0$ in the norm of $C([0,T];X)$, the Banach
space of all continuous functions from $[0,T]$ to $X$. Indeed, if $\tilde u_0$ is another initial value
and the corresponding unique strong solution is denoted by $\tilde u$, then

\[ \|u(t) - \tilde u(t)\| \le \|S(t)\|\,\|u_0 - \tilde u_0\| \le M\|u_0 - \tilde u_0\|, \]

where $M := \sup_{t\in[0,T]} \|S(t)\|$, and therefore

\[ \|u - \tilde u\|_\infty \le M\|u_0 - \tilde u_0\|. \]


Unique solvability plus continuous dependence on the initial value is usually sum- 
marised as well-posedness. Thus, the inhomogeneous problem (IACP) is well posed 
for strong solutions. 


13.3.b The Semilinear Cauchy Problem 
In this section we study a class of nonlinear evolution equations of the form 


\[ \begin{cases} u'(t) = Au(t) + f(t,u(t)), & t \in [0,T], \\ u(0) = u_0. \end{cases} \tag{SCP} \]

Equations of this form are referred to as semilinear equations. We assume that $A$
generates a $C_0$-semigroup $S$ on $X$ and that the initial value $u_0$ lies in $X$. We make the following
assumptions on the function $f : [0,T] \times X \to X$:

(i) (Integrability) for all $x \in X$ the function $t \mapsto f(t,x)$ is Bochner integrable on $[0,T]$;
(ii) (Linear growth) there exists a constant $C \ge 0$ such that

\[ \|f(t,x)\| \le C(1 + \|x\|), \qquad t \in [0,T],\ x \in X; \]

(iii) (Lipschitz continuity) there exists a constant $L \ge 0$ such that

\[ \|f(t,x) - f(t,x')\| \le L\|x - x'\|, \qquad t \in [0,T],\ x, x' \in X. \]


Under these assumptions, in force throughout this section, we will prove existence, 
uniqueness, and continuous dependence on the initial conditions of mild solutions. Thus 
(SCP) is well posed for mild solutions. 


Definition 13.25 (Mild solutions). A function $u : [0,T] \to X$ is called a mild solution of
(SCP) if it is continuous and satisfies

\[ u(t) = S(t)u_0 + \int_0^t S(t-s)f(s,u(s))\,ds, \qquad t \in [0,T]. \]

To see that this is well defined we must check that the integral converges as a Bochner
integral in $X$. Lemma 13.26 takes care of this. Taking the lemma for granted for the
moment, let us first motivate the definition.

First of all, it generalises the formula of Theorem 13.24 for the strong solution of the
inhomogeneous problem. Perhaps more importantly, we shall prove that every classical
solution is a mild solution. To prepare for this, suppose that $u : [0,T] \to X$ is not just
continuous but continuously differentiable and takes values in $D(A)$. Then it makes
sense to ask whether $u$ satisfies (SCP) in a pointwise sense. If it does, we call $u$ a
classical solution. Let us assume that this is the case. Multiplying the equation for
$s \in [0,t]$ on both sides with $S(t-s)$ and integrating, we obtain

\[ \int_0^t S(t-s)u'(s)\,ds = \int_0^t S(t-s)Au(s)\,ds + \int_0^t S(t-s)f(s,u(s))\,ds. \]

On the other hand, an integration by parts, using that $u(0) = u_0$ and $S'(t)x = S(t)Ax$ for
$x \in D(A)$, gives the identity

\[ \int_0^t S(t-s)u'(s)\,ds = u(t) - S(t)u_0 + \int_0^t S(t-s)Au(s)\,ds. \]

Substituting this identity into the preceding one, the identity defining a mild solution is
obtained.

In general there is no reason to expect existence of classical solutions, but, under the 
standing assumptions (i)—(iii) formulated above, a unique mild solution always exists. 
In the definition of a mild solution, no differentiability or D(A)-valuedness is imposed, 
and this is precisely what makes things work. 

As promised we now check that the integral in Definition 13.25 is well defined as a 
Bochner integral in X. The next result extends Lemma 13.23 to the present situation. 


Lemma 13.26. Let $f : [0,T] \times X \to X$ satisfy the conditions (i)–(iii) and suppose that
$u : [0,T] \to X$ is continuous. Then:

(1) the functions $s \mapsto f(s,u(s))$ and $s \mapsto S(t-s)f(s,u(s))$ have strongly measurable
representatives and are integrable;
(2) the function $t \mapsto \int_0^t S(t-s)f(s,u(s))\,ds$ is continuous on $[0,T]$.


Proof (1): First let $v = \sum_{j=1}^k 1_{I_j} \otimes x_j$ be an $X$-valued step function, where the intervals
$I_j \subseteq [0,T]$ are disjoint. If $s \in I_j$, then $f(s,v(s)) = f(s,x_j)$ and therefore $s \mapsto f(s,v(s))$
belongs to $L^1(0,T;X)$ by the integrability assumption (i). Moreover, using the linear
growth assumption,

\[ \int_0^T \|f(s,v(s))\|\,ds = \sum_{j=1}^k \int_{I_j} \|f(s,x_j)\|\,ds \le \sum_{j=1}^k |I_j|\,C(1 + \|x_j\|) = C \int_0^T (1 + \|v(s)\|)\,ds. \]

If $v'$ is another $X$-valued step function, from the Lipschitz continuity assumption (iii)
we obtain the estimate

\[ \int_0^T \|f(s,v(s)) - f(s,v'(s))\|\,ds \le LT\|v - v'\|_\infty. \]

Since $u : [0,T] \to X$ is continuous, we can find step functions $u_n : [0,T] \to X$ such
that $\|u - u_n\|_\infty < 1/n$. Then, for $m, n \ge N$,

\[ \|u_n - u_m\|_\infty \le \|u_n - u\|_\infty + \|u - u_m\|_\infty < \frac1n + \frac1m \le \frac2N, \]

so the functions $s \mapsto f(s,u_n(s))$ form a Cauchy sequence in $L^1(0,T;X)$. By the
completeness of $L^1(0,T;X)$ they tend to a limit, say $g$. Moreover, after passing to a
subsequence, we may assume that $\lim_{n\to\infty} f(s,u_n(s)) = g(s)$ for almost all $s \in [0,T]$. By
modifying the functions on a common Borel null set as in the proof of Lemma 13.23
we may even assume that the convergence holds pointwise. On the other hand, for all
$s \in [0,T]$ we have

\[ \|f(s,u_n(s)) - f(s,u(s))\| \le L\|u_n(s) - u(s)\| \le \frac{L}{n}. \]

It follows that $g(s) = f(s,u(s))$ for almost all $s \in [0,T]$. In particular, this proves that
$s \mapsto f(s,u(s))$ has a strongly measurable representative and belongs to $L^1(0,T;X)$.


(2): This follows by applying Lemma 13.23 to the function $s \mapsto f(s,u(s))$.


We are now ready to state and prove our main result: 


Theorem 13.27 (Well-posedness of the semilinear problem). Under the assumptions
(i)–(iii) formulated at the beginning of the section, the semilinear problem (SCP) admits
a unique mild solution $u \in C([0,T];X)$. This solution depends continuously, in the norm
of $C([0,T];X)$, on the initial condition $u_0 \in X$.


Proof To obtain existence and uniqueness we define a nonlinear mapping $\Phi$ from
$C([0,T];X)$ to itself by

\[ (\Phi(v))(t) := S(t)u_0 + \int_0^t S(t-s)f(s,v(s))\,ds, \quad t \in [0,T],\ v \in C([0,T];X). \]


We have already observed in Lemma 13.26 that the integrand is integrable, and the
continuity of $\Phi(v)$ follows from the strong continuity of the semigroup and Lemma
13.23. It follows that $\Phi$ is well defined as a mapping of $C([0,T];X)$ into itself. We now
reuse the idea in the proof of Lemma 2.14 and set, for a parameter $\lambda > 0$ to be chosen
in a moment,

\[ \|g\|_\lambda := \sup_{t \in [0,T]} e^{-\lambda t}\|g(t)\|. \]

This defines an equivalent norm on $C([0,T];X)$. By the Lipschitz continuity assumption,
for all $v,w \in C([0,T];X)$ and $t \in [0,T]$ we have

\[ \begin{aligned}
\|(\Phi(v))(t) - (\Phi(w))(t)\| &\le \int_0^t \|S(t-s)(f(s,v(s)) - f(s,w(s)))\|\,ds \\
&\le LM \int_0^t e^{\lambda s} e^{-\lambda s}\|v(s) - w(s)\|\,ds \\
&\le LM \int_0^t e^{\lambda s}\,ds\,\|v-w\|_\lambda = \frac{LM}{\lambda}(e^{\lambda t} - 1)\|v-w\|_\lambda,
\end{aligned} \]

where $M = \sup_{t \in [0,T]}\|S(t)\|$. Multiplying by $e^{-\lambda t}$ and taking the supremum over $t \in [0,T]$, it follows that

\[ \|\Phi(v) - \Phi(w)\|_\lambda \le \frac{LM}{\lambda}(1 - e^{-\lambda T})\|v-w\|_\lambda \le \frac{LM}{\lambda}\|v-w\|_\lambda. \]

If we choose $\lambda > LM$, the mapping $\Phi$ is a uniform contraction on $C([0,T];X)$ with
respect to the norm $\|\cdot\|_\lambda$ and therefore has a unique fixed point $u \in C([0,T];X)$ by the
Banach fixed point theorem (Theorem 2.13). Then,


\[ u(t) = (\Phi(u))(t) = S(t)u_0 + \int_0^t S(t-s)f(s,u(s))\,ds, \quad t \in [0,T], \]


so u is a mild solution. Conversely, any mild solution is a fixed point of ®, and since ® 
has a unique fixed point the mild solution u is unique. 
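
The fixed point argument can be tried out numerically. The following is a minimal sketch under simplifying assumptions that are not part of the text: X is the real line, the semigroup is $S(t) = e^{at}$ with $a = -1$ (so $M = 1$), the nonlinearity is $f(s,x) = \sin x$ (so $L = 1$), and the integral is replaced by the trapezoidal rule on a uniform grid. The function name `picard_iterates` is illustrative only.

```python
import math


def picard_iterates(a, f, u0, T, n_steps=100, n_iter=20):
    """Apply v -> Phi(v) repeatedly on a uniform grid of [0, T].

    (Phi(v))(t) = e^{a t} u0 + int_0^t e^{a (t - s)} f(v(s)) ds,
    with the integral replaced by the composite trapezoidal rule.
    Returns the grid, the last iterate, and the sup-distances between
    successive iterates.
    """
    h = T / n_steps
    ts = [k * h for k in range(n_steps + 1)]
    v = [u0] * (n_steps + 1)  # initial guess: the constant function u0
    gaps = []
    for _ in range(n_iter):
        fv = [f(x) for x in v]
        new = []
        for k, t in enumerate(ts):
            integral = 0.0
            for j in range(k):
                left = math.exp(a * (t - ts[j])) * fv[j]
                right = math.exp(a * (t - ts[j + 1])) * fv[j + 1]
                integral += 0.5 * h * (left + right)
            new.append(math.exp(a * t) * u0 + integral)
        # sup-norm distance between Phi(v) and v on the grid
        gaps.append(max(abs(x - y) for x, y in zip(new, v)))
        v = new
    return ts, v, gaps


# Assumed toy data: u' = -u + sin(u), u(0) = 1, on [0, 1].
ts, u, gaps = picard_iterates(a=-1.0, f=math.sin, u0=1.0, T=1.0)
```

The list `gaps` records how far each iterate moves; in this toy setting the distances shrink rapidly, which is the behaviour the contraction estimate predicts.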

To complete the proof we check the continuous dependence of the mild solution on
the initial value $u_0$. If $\tilde u_0$ is another initial value and the corresponding unique mild
solution is denoted by $\tilde u$, estimating as before we obtain

\[ \begin{aligned}
\|u(t) - \tilde u(t)\| &\le \|S(t)\|\,\|u_0 - \tilde u_0\| + \int_0^t \|S(t-s)(f(s,u(s)) - f(s,\tilde u(s)))\|\,ds \\
&\le M\|u_0 - \tilde u_0\| + \frac{LM}{\lambda}(e^{\lambda t} - 1)\|u - \tilde u\|_\lambda,
\end{aligned} \]

and therefore

\[ \|u - \tilde u\|_\lambda \le M\|u_0 - \tilde u_0\| + \frac{LM}{\lambda}\|u - \tilde u\|_\lambda. \]

Choosing $\lambda = 2LM$ gives

\[ \tfrac12 \|u - \tilde u\|_{2LM} \le M\|u_0 - \tilde u_0\| \]

and the desired continuity follows, keeping in mind that $\|\cdot\|_{2LM}$ is an equivalent norm
on $C([0,T];X)$.


For this to be useful, one must have ways to ‘translate’ nonlinearities occurring in
concrete partial differential equations into our abstract framework. We demonstrate how
this works by means of an example. Consider the following semilinear heat equation on
a nonempty bounded open subset D of $\mathbb{R}^d$:

\[ \begin{aligned}
\frac{\partial u}{\partial t}(t,\xi) &= \Delta u(t,\xi) + b(u(t,\xi)), && \xi \in D,\ t \in [0,T], \\
u(t,\xi) &= 0, && \xi \in \partial D,\ t \in [0,T], \\
u(0,\xi) &= u_0(\xi), && \xi \in D.
\end{aligned} \]

We assume that $b : \mathbb{R} \to \mathbb{R}$ is Lipschitz continuous, with Lipschitz constant L:

\[ |b(\eta) - b(\eta')| \le L|\eta - \eta'|, \quad \eta,\eta' \in \mathbb{R}. \]


The assumption that b only depends on the solution is made for simplicity; the more 
general case where b also depends on time can be treated in the same way. 

To cast this problem into a semilinear abstract Cauchy problem we assume that the
initial value $u_0$ belongs to $L^2(D)$. The above problem may then be written in the form

\[ \begin{aligned} u'(t) &= Au(t) + B(u(t)), \\ u(0) &= u_0, \end{aligned} \]

where A is the Dirichlet Laplacian on $L^2(D)$, which generates an analytic $C_0$-contraction
semigroup on this space (see Proposition 13.49), and $B : L^2(D) \to L^2(D)$ is the Nemytskii
mapping associated with b:

\[ (B(x))(\xi) := b(x(\xi)), \quad x \in L^2(D). \]

The next proposition checks that this mapping is well defined, of linear growth, and
Lipschitz continuous on $L^2(D)$ (and, with the same proof, on $L^p(D)$ with $1 \le p < \infty$).


Proposition 13.28. Under the above assumptions on b, the Nemytskii mapping
$B : L^2(D) \to L^2(D)$ is well defined, of linear growth, and Lipschitz continuous (in the sense
that $f(t,x) := B(x)$ satisfies conditions (ii) and (iii) at the beginning of this section).


Proof Let us first check that $B(x) \in L^2(D)$ for all $x \in L^2(D)$. Using the triangle
inequality in $L^2(D)$, for all $x \in L^2(D)$ we have

\[ \begin{aligned}
\|B(x)\|_{L^2(D)} &= \Bigl(\int_D |b(x(\xi))|^2\,d\xi\Bigr)^{1/2} \\
&\le \Bigl(\int_D |b(x(\xi)) - b(0)|^2\,d\xi\Bigr)^{1/2} + \Bigl(\int_D |b(0)|^2\,d\xi\Bigr)^{1/2} \\
&\le L\Bigl(\int_D |x(\xi) - 0|^2\,d\xi\Bigr)^{1/2} + |b(0)|\Bigl(\int_D 1\,d\xi\Bigr)^{1/2} = L\|x\|_{L^2(D)} + |b(0)|\,|D|^{1/2},
\end{aligned} \]

where |D| stands for the Lebesgue measure of D. This proves that B is well defined and
of linear growth.

Lipschitz continuity follows by a similar estimate. For all $x,y \in L^2(D)$,

\[ \|B(x) - B(y)\|_{L^2(D)} = \Bigl(\int_D |b(x(\xi)) - b(y(\xi))|^2\,d\xi\Bigr)^{1/2} \le L\Bigl(\int_D |x(\xi) - y(\xi)|^2\,d\xi\Bigr)^{1/2} = L\|x - y\|_{L^2(D)}. \]


We have thus shown that all assumptions of Theorem 13.27 are fulfilled. Accord- 
ingly we obtain unique solvability of the semilinear heat equation, in the sense that the 
corresponding abstract Cauchy problem admits a unique mild solution. 
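
The two estimates of Proposition 13.28 can be checked on sampled data. The sketch below makes assumptions that go beyond the text: D is taken to be the interval (0,1), functions are represented by their values on a midpoint grid, the $L^2$ norm is approximated by a Riemann sum, and b is the arctangent (Lipschitz constant 1). The sample functions x and y are arbitrary choices for illustration.

```python
import math


def l2_norm(values, h):
    """Riemann-sum approximation of the L^2(0,1) norm of grid values."""
    return math.sqrt(h * sum(v * v for v in values))


def nemytskii(b, x):
    """Apply the scalar function b pointwise: (B(x))(xi) = b(x(xi))."""
    return [b(v) for v in x]


n = 1000
h = 1.0 / n
grid = [(k + 0.5) * h for k in range(n)]
x = [math.sin(7 * xi) + xi ** 2 for xi in grid]   # arbitrary sample function
y = [math.cos(3 * xi) - 2 * xi for xi in grid]    # arbitrary sample function

b = math.atan   # assumed nonlinearity, Lipschitz with constant L = 1
L = 1.0

# Lipschitz estimate: ||B(x) - B(y)|| <= L ||x - y||
lhs = l2_norm([p - q for p, q in zip(nemytskii(b, x), nemytskii(b, y))], h)
rhs = L * l2_norm([p - q for p, q in zip(x, y)], h)

# Linear growth: ||B(x)|| <= L ||x|| + |b(0)| |D|^{1/2}, with |D| = 1 here
growth_lhs = l2_norm(nemytskii(b, x), h)
growth_rhs = L * l2_norm(x, h) + abs(b(0.0)) * 1.0
```

Because both inequalities already hold pointwise before integration, the discrete norms inherit them exactly, which mirrors the structure of the proof.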


13.4 Analytic Semigroups 


Analytic semigroups provide an abstract framework for discussing a class of initial 
value problems, referred to in the partial differential equations literature as parabolic. 
An important characteristic of this class of problems is that solutions are smooth. 


13.4.a The Main Result 


For $\omega \in (0,\pi)$ consider the open sector
\[ \Sigma_\omega := \{z \in \mathbb{C} \setminus \{0\} : |\arg(z)| < \omega\}, \]
where the argument is taken in $(-\pi,\pi)$.


Definition 13.29 (Analytic $C_0$-semigroups). A $C_0$-semigroup S on X is called analytic
on $\Sigma_\omega$ if for all $x \in X$ the function $t \mapsto S(t)x$ extends holomorphically to $\Sigma_\omega$ and satisfies
\[ \lim_{z \in \Sigma_\omega,\ z \to 0} S(z)x = x. \]
We call S an analytic $C_0$-semigroup if it is an analytic $C_0$-semigroup on $\Sigma_\omega$ for some
$\omega \in (0,\pi)$.


If S is an analytic $C_0$-semigroup on $\Sigma_\omega$, then for all $z_1, z_2 \in \Sigma_\omega$ we have
\[ S(z_1)S(z_2) = S(z_1 + z_2). \]


Indeed, for each $x \in X$ and $t > 0$ the functions $z_1 \mapsto S(z_1)S(t)x$ and $z_1 \mapsto S(z_1 + t)x$ are holomorphic
extensions of $s \mapsto S(s+t)x$ and are therefore equal. Repeating this argument, the functions
$z_2 \mapsto S(z_1)S(z_2)x$ and $z_2 \mapsto S(z_1 + z_2)x$ are holomorphic extensions of $t \mapsto S(z_1 + t)x$ and are
therefore equal.

As in the proof of Proposition 13.3, the uniform boundedness theorem implies that if
S is an analytic $C_0$-semigroup on $\Sigma_\omega$, then the operators S(z) are uniformly bounded on
$\Sigma_{\omega'} \cap B(0;r)$ for every $0 < \omega' < \omega$ and $r > 0$. The same argument as in Proposition 13.3
then gives exponential boundedness on $\Sigma_{\omega'}$ for all $0 < \omega' < \omega$, in the sense that there
are constants $M' \ge 1$ and $c' = c_{\omega'} \in \mathbb{R}$ such that

\[ \|S(z)\| \le M' e^{c'|z|}, \quad z \in \Sigma_{\omega'}. \]


We say that S is a bounded analytic $C_0$-semigroup on $\Sigma_\omega$ if S is an analytic $C_0$-semigroup
on $\Sigma_\omega$ and the operators S(z) are uniformly bounded on $\Sigma_\omega$. Analytic $C_0$-contraction
semigroups on $\Sigma_\omega$ are defined similarly. There is a rather subtle point here: boundedness
and contractivity are imposed on a sector, not just on the positive real line. That this
makes a difference is shown by the simple example of the rotation group on $\mathbb{C}^2$, given by

\[ S(t) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}. \]

For each $t \in \mathbb{R}$ we have $\|S(t)\| = 1$. Upon replacing t by a complex parameter z the
group extends holomorphically to the entire complex plane, but it is unbounded on every
sector $\Sigma_\omega$ with $0 < \omega < \pi$. It may even happen that a bounded analytic $C_0$-semigroup is
contractive on the positive real line, yet fails to be an analytic $C_0$-contraction semigroup;
an example of such a semigroup on $\mathbb{C}^2$ is discussed in Problem 13.14.
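
The growth of the complexified rotation group can be observed numerically. The sketch below is illustrative only: it evaluates S(z) for one real and one complex argument and computes the operator norm on $\mathbb{C}^2$ (with the Euclidean norm) from the Frobenius norm and the determinant. The helper names are hypothetical; the comparison value $e^{|\mathrm{Im}\,z|}$ comes from the fact that S(z) is a normal matrix with eigenvalues $e^{\pm iz}$.

```python
import cmath
import math


def rotation(z):
    """S(z) = [[cos z, -sin z], [sin z, cos z]] for complex z."""
    c, s = cmath.cos(z), cmath.sin(z)
    return [[c, -s], [s, c]]


def spectral_norm_2x2(m):
    """Largest singular value of a complex 2x2 matrix.

    The squared singular values are the roots of x^2 - F^2 x + |det|^2,
    where F is the Frobenius norm of the matrix.
    """
    frob2 = sum(abs(entry) ** 2 for row in m for entry in row)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = max(frob2 ** 2 - 4 * abs(det) ** 2, 0.0)  # guard against rounding
    return math.sqrt((frob2 + math.sqrt(disc)) / 2)


# On the real line the norm is 1 ...
norm_real = spectral_norm_2x2(rotation(5.0))

# ... but along the ray arg(z) = omega it grows like exp(|z| sin(omega)).
omega = 0.3
z = 5.0 * cmath.exp(1j * omega)
norm_sector = spectral_norm_2x2(rotation(z))
expected = math.exp(abs(z.imag))
```

Since the growth rate along a ray is $\sin\omega > 0$ for every opening angle, no sector, however narrow, gives a uniform bound.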


Theorem 13.30 (Bounded analytic semigroups, complex characterisation). For a densely
defined closed operator A in X the following assertions are equivalent:


(1) A generates a bounded analytic $C_0$-semigroup on $\Sigma_\eta$ for some $\eta \in (0,\tfrac12\pi)$;
(2) there exists $\theta \in (\tfrac12\pi,\pi)$ such that $\Sigma_\theta \subseteq \rho(A)$ and

\[ \sup_{\lambda \in \Sigma_\theta} \|\lambda R(\lambda,A)\| < \infty. \]

Denoting the suprema of all admissible $\eta$ and $\theta$ in (1) and (2) by $\omega_{\rm holo}(A)$ and $\omega_{\rm res}(A)$
respectively, we have

\[ \omega_{\rm res}(A) = \tfrac12\pi + \omega_{\rm holo}(A). \]

Under the equivalent conditions (1) and (2) we have the inverse Laplace transform
representation

\[ S(t)x = \frac{1}{2\pi i}\int_\Gamma e^{\lambda t} R(\lambda,A)x\,d\lambda, \quad t > 0,\ x \in X, \qquad (13.11) \]

where $\Gamma = \Gamma_{\theta',B}$ is the upwards oriented boundary of $\Sigma_{\theta'} \setminus B$, for any $\theta' \in (\tfrac12\pi,\theta)$ and
any closed ball B centred at the origin.


Note that (2) implies $\sigma(A) \cap i\mathbb{R} \subseteq \{0\}$.
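
The representation (13.11) can be tested in the simplest possible case. The sketch below assumes $X = \mathbb{C}$ and $A = -a$ with $a > 0$, so that $R(\lambda,A) = (\lambda + a)^{-1}$ and $S(t) = e^{-at}$; these choices, the truncation of the rays at a finite radius, and the use of Simpson's rule are all assumptions made for illustration.

```python
import cmath
import math


def simpson(g, a, b, n):
    """Composite Simpson rule for a complex-valued g on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def semigroup_via_contour(a, t, theta=0.75 * math.pi, rho=1.0, r_max=80.0, n=8000):
    """Approximate (1/(2 pi i)) int_Gamma e^{lambda t} (lambda + a)^{-1} d lambda,

    where Gamma is the upwards oriented boundary of the sector of angle theta
    with the ball of radius rho removed, truncated at radius r_max.
    """
    def integrand(lam):
        return cmath.exp(lam * t) / (lam + a)

    up = cmath.exp(1j * theta)
    down = cmath.exp(-1j * theta)
    # lower ray, traversed from infinity towards the arc (hence the minus sign)
    lower_ray = -simpson(lambda r: integrand(r * down) * down, rho, r_max, n)
    # arc of radius rho through the positive real axis
    arc = simpson(
        lambda phi: integrand(rho * cmath.exp(1j * phi)) * 1j * rho * cmath.exp(1j * phi),
        -theta, theta, n,
    )
    # upper ray, traversed from the arc towards infinity
    upper_ray = simpson(lambda r: integrand(r * up) * up, rho, r_max, n)
    return (lower_ray + arc + upper_ray) / (2j * math.pi)


value = semigroup_via_contour(a=2.0, t=1.0)
exact = math.exp(-2.0)
```

The pole at $\lambda = -a$ lies to the left of the contour, and the integrand decays exponentially along the rays because $\theta' > \tfrac12\pi$; this is what makes the truncated integral a good approximation.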
Proof By Cauchy’s theorem, if the integral representation holds for some choice of
$\theta' \in (\tfrac12\pi,\theta)$ and a closed ball B centred at the origin, then it holds for any such $\theta'$ and
B.


(1)⇒(2): We start with the preliminary observation that if a linear operator A generates
a uniformly bounded $C_0$-semigroup S on X, then, by Proposition 13.8, the open
right half-plane $\mathbb{C}_+ = \{\mathrm{Re}\,\lambda > 0\}$ is contained in the resolvent set of A and we have the
bound $\|R(\lambda,A)\| \le M/\mathrm{Re}\,\lambda$ for all $\lambda \in \mathbb{C}_+$. Moreover, for all $\theta \in (0,\tfrac12\pi)$ and $\lambda \in \Sigma_\theta$
we have $\mathrm{Re}\,\lambda \ge |\lambda|\cos\theta$ and therefore

\[ \sup_{\lambda \in \Sigma_\theta} \|\lambda R(\lambda,A)\| \le \frac{M}{\cos\theta}. \]

Now if A generates a $C_0$-semigroup which is bounded on a sector $\Sigma_\eta$ with $\eta \in (0,\tfrac12\pi)$,
say by a constant M, we can apply the above reasoning to the bounded semigroups
$S(e^{\pm i\eta'}t)$, with $\eta' \in (0,\eta)$, and obtain (2). Optimising the various choices of angles we
obtain the inequality

\[ \omega_{\rm res}(A) \ge \tfrac12\pi + \omega_{\rm holo}(A). \]

(2)⇒(1): The idea is to define the semigroup operators by the integral representation
given in the statement of the theorem, and prove that they define a bounded $C_0$-semigroup
which has the properties stated in part (1).

Once we have this, it is fairly straightforward to deduce (1) with $\eta = \theta - \tfrac12\pi$; this is
done in the second step and gives the inequality

\[ \omega_{\rm holo}(A) \ge \omega_{\rm res}(A) - \tfrac12\pi. \]

Step 1 – Let $\eta := \theta - \tfrac12\pi$. For any $\zeta \in \Sigma_\eta$ let

\[ S(\zeta)x := \frac{1}{2\pi i}\int_\Gamma e^{\lambda\zeta} R(\lambda,A)x\,d\lambda, \quad x \in X, \]

where $\Gamma$ is the boundary of $\Sigma_{\theta'} \setminus B$ with $\theta' \in (\tfrac12\pi,\theta)$ any number such that
$\tfrac12\pi + |\arg(\zeta)| < \theta' < \theta$ and B is any closed ball centred at the origin; see Figure 13.1. This
integral converges absolutely and defines a bounded holomorphic function on the sector
$\Sigma_\eta$.

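The proof of the semigroup property proceeds much in the same way as the proof of
the multiplicativity of the holomorphic calculus. Fix $\zeta,\zeta' \in \Sigma_\eta$ and choose contours $\Gamma$
and $\Gamma'$ as above, with $\Gamma'$ to the right of $\Gamma$. Then, by the resolvent identity (10.2), Cauchy’s
theorem, Fubini’s theorem, and the Cauchy integral formula,

\[ \begin{aligned}
S(\zeta')S(\zeta)x &= \frac{1}{(2\pi i)^2}\int_{\Gamma'}\int_\Gamma e^{\mu\zeta' + \lambda\zeta} R(\mu,A)R(\lambda,A)x\,d\lambda\,d\mu \\
&= \frac{1}{(2\pi i)^2}\int_{\Gamma'}\int_\Gamma e^{\mu\zeta' + \lambda\zeta}\,\frac{R(\lambda,A)x - R(\mu,A)x}{\mu - \lambda}\,d\lambda\,d\mu \\
&= \frac{1}{(2\pi i)^2}\int_{\Gamma'}\int_\Gamma e^{\mu\zeta' + \lambda\zeta}\,\frac{R(\lambda,A)x}{\mu - \lambda}\,d\lambda\,d\mu \\
&= \frac{1}{2\pi i}\int_\Gamma e^{\lambda(\zeta + \zeta')} R(\lambda,A)x\,d\lambda = S(\zeta + \zeta')x.
\end{aligned} \]

Figure 13.1 The contour $\Gamma$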


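Put $M := \sup_{\lambda \in \Sigma_\theta} \|\lambda R(\lambda,A)\|$ and fix $\zeta \in \Sigma_\eta$. To estimate the norm of $S(\zeta)x$, by
Cauchy’s theorem we may take $\Gamma = \Gamma_{\theta',B_r}$, with $B_r = B(0;r)$ the ball of radius r and
centre 0, where we take $\tfrac12\pi + |\arg(\zeta)| < \theta' < \theta$ as before; the choice of $r > 0$ will be
made shortly. The arc $\{|z| = r,\ |\arg(z)| < \theta'\}$ contributes at most

\[ \frac{1}{2\pi} \cdot 2\theta' r \cdot \frac{M}{r}\exp(r|\zeta|) = \frac{\theta' M}{\pi}\exp(r|\zeta|), \]

while each of the rays $\{|z| \ge r,\ \arg(z) = \pm\theta'\}$ contributes at most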


M [@ ; 1 M 
Seas = i ei ys es i 
ag |, eP(-elélleos(6")I) 40 < 55 eer 
It follows that


ud "exp(r soot 
Isil< = (@ p( IoD + sar osey])” 


Taking $r = 1/|\zeta|$ and letting $\theta' \to \theta$ we obtain the uniform bound


, 1 
" |cos( 


M 1 
s()Il< —(0 ), $€En, n= 50-8. 
IS@)I< 5 (6e+ ey) SEE 15 
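It remains to prove strong continuity on each sector $\Sigma_{\eta'}$ with $0 < \eta' < \eta$. Let
$x \in D(A)$. Fix $\zeta \in \Sigma_{\eta'}$ and write $x = R(\mu,A)y$ with $\mu \in \Sigma_{\theta'} \setminus B$, so that $\mu$ lies to the right of $\Gamma$. Inserting this in the
integral expression for $S(\zeta)x$, using the resolvent identity to rewrite
$R(\lambda,A)R(\mu,A)y = (R(\lambda,A) - R(\mu,A))y/(\mu - \lambda)$, and arguing as above, we find that the integral
corresponding to the term with $R(\mu,A)$ vanishes by Cauchy’s theorem and the choice of $\mu$
and obtain

\[ S(\zeta)x = \frac{1}{2\pi i}\int_\Gamma e^{\lambda\zeta}(\mu - \lambda)^{-1} R(\lambda,A)y\,d\lambda. \]

Letting $\zeta \to 0$ inside $\Sigma_{\eta'}$ we see that $S(\zeta)x$ converges to

\[ \frac{1}{2\pi i}\int_\Gamma (\mu - \lambda)^{-1} R(\lambda,A)y\,d\lambda = R(\mu,A)y = x \]

by dominated convergence.

This proves the convergence $S(\zeta)x \to x$ for $x \in D(A)$ as $\zeta \to 0$ inside $\Sigma_{\eta'}$. In view
of the uniform boundedness of $S(\zeta)$ on $\Sigma_{\eta'}$, the convergence for general $x \in X$ follows
from this.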


The following result characterises analytic Co-semigroups directly in terms of the 
semigroup and its generator, without reference to the resolvent. Its importance lies in 
the smoothing property revealed by (2): the semigroup operators S(t) map every x € X 
into the smaller subspace D(A) for all t > 0. 


Theorem 13.31 (Bounded analytic semigroups, real characterisation). Let A be the
generator of a $C_0$-semigroup S on X. The following assertions are equivalent:


(1) S is bounded analytic;
(2) $S(t)x \in D(A)$ for all $x \in X$ and $t > 0$, and

\[ \sup_{t > 0} t\|AS(t)\| < \infty. \]


Remark 13.32. By writing $S(t) = [S(t/n)]^n$, part (2) self-improves as follows: for all
$x \in X$ and $t > 0$ we have $S(t)x \in D(A^n)$, and

\[ \sup_{t > 0} t^n\|A^n S(t)\| =: M_n < \infty. \]

This will be used in the proof below.

Proof of Theorem 13.31 (1)⇒(2): Fix $t > 0$ and $x \in X$. Arguing as in the proof of
Proposition 13.8, from the integral representation (13.11) with $\Gamma = \partial(\Sigma_{\theta'} \setminus B)$ we deduce
that $S(t)x \in D(A)$ and

\[ AS(t)x = \frac{1}{2\pi i}\int_\Gamma e^{\lambda t} AR(\lambda,A)x\,d\lambda. \]

The integral on the right-hand side converges absolutely since

\[ \sup_{\lambda \in \Gamma}\|AR(\lambda,A)\| = \sup_{\lambda \in \Gamma}\|\lambda R(\lambda,A) - I\| < \infty. \]

By estimating this integral and letting the radius of the ball B in the definition of $\Gamma$ tend
to 0, it follows moreover that

\[ t\|AS(t)x\| \le \frac{M}{\pi}\|x\|\,t\int_0^\infty e^{\rho t\cos\theta'}\,d\rho = \frac{M}{\pi|\cos\theta'|}\|x\|, \]

where M denotes the supremum in the preceding display.

(2)⇒(1): For all $x \in D(A^n)$ the mapping $t \mapsto S(t)x$ is n times continuously differentiable
and $S^{(n)}(t)x = A^n S(t)x = (AS(t/n))^n x$. Since $D(A^n)$ is dense in X, the boundedness
of $AS(t/n)$ and closedness of the nth derivative in $C([0,T];X)$ together imply that
the same conclusion holds for $x \in X$. Moreover,

\[ \|S^{(n)}(t)x\| \le \frac{C^n n^n}{t^n}\|x\|, \]

where C is the supremum in (2). From the inequality $n! \ge n^n/e^n$ (which follows from
Stirling’s inequality) we obtain that for each $t > 0$ the series

\[ S(z)x := \sum_{n=0}^\infty \frac{1}{n!}(z - t)^n S^{(n)}(t)x \]

converges absolutely on every ball $B(t;rt/eC)$ with $0 < r < 1$ and defines a holomorphic
function there. The union of all these balls is the sector $\Sigma_\eta$ with $\sin\eta = 1/eC$.
We shall complete the proof by showing that S(z) is uniformly bounded and satisfies
$\lim_{z \to 0} S(z)x = x$ in $\Sigma_{\eta'}$ for each $0 < \eta' < \eta$. To this end we fix $0 < r < 1$ so that the
union of the balls $B(t;rt/eC)$ equals $\Sigma_{\eta'}$. For $z \in B(t;rt/eC)$ we have

\[ \|S(z)x\| \le \sum_{n=0}^\infty \frac{1}{n!}\Bigl(\frac{rt}{eC}\Bigr)^n \frac{C^n n^n}{t^n}\|x\| \le \sum_{n=0}^\infty r^n\|x\| = \frac{1}{1-r}\|x\|. \]

This proves uniform boundedness on the sectors $\Sigma_{\eta'}$. To prove strong continuity it then
suffices to consider $x \in D(A)$, for which it follows from estimating the identity

\[ S(z)x - x = e^{i\phi}\int_0^r S(se^{i\phi})Ax\,ds, \]

where $z = re^{i\phi}$.
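
Condition (2) of Theorem 13.31 is easy to evaluate for diagonal operators. The sketch below is a toy model under assumptions not taken from the text: A acts as multiplication by $-\mu_k$ on the k-th coordinate with $\mu_k = k^2$ (mimicking the Dirichlet Laplacian on an interval), so that $t\|AS(t)\| = \sup_k t\mu_k e^{-\mu_k t}$. For contrast it also treats the scalar generator $A = i$ of a rotation group, for which $t|AS(t)| = t$ is unbounded.

```python
import math


def t_times_norm_AS(t, eigenvalues):
    """t * ||A S(t)|| for A = diag(-mu_k), mu_k >= 0, S(t) = diag(exp(-mu_k t))."""
    return t * max(mu * math.exp(-mu * t) for mu in eigenvalues)


# Assumed spectrum: mu_k = k^2, a model for the Dirichlet Laplacian on an interval.
eigenvalues = [k * k for k in range(1, 200)]
times = [10.0 ** (e / 10.0) for e in range(-60, 21)]   # t from 1e-6 to 1e2
sup_heat = max(t_times_norm_AS(t, eigenvalues) for t in times)


def t_times_norm_AS_rotation(t, frequency=1.0):
    """For A = i * frequency on C: S(t) = exp(i * frequency * t), |A S(t)| = frequency."""
    return t * frequency


growth_rotation = t_times_norm_AS_rotation(100.0)
```

The value of `sup_heat` never exceeds $1/e$, the maximum of $s \mapsto se^{-s}$, independently of how many eigenvalues are included; this uniformity is what the smoothing estimate expresses.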


13.4.b The Lumer-Phillips Theorem 


The main result of this section is the Lumer—Phillips theorem, which gives a charac- 
terisation of analytic Co-semigroups of contractions in Hilbert spaces. We begin with a 
useful lemma about extending resolvent bounds from a half-line to a sector. 


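Lemma 13.33. Let A be a linear operator acting in X. If the open half-line $(0,\infty)$ is
contained in $\rho(A)$ and

\[ \sup_{\lambda > 0}\|\lambda R(\lambda,A)\| =: M < \infty, \]

then $M \ge 1$, the open sector $\Sigma := \{\lambda \in \mathbb{C} \setminus \{0\} : |\arg(\lambda)| < \arcsin(1/M)\}$ is contained in
$\rho(A)$, and for all $\varphi \in (0,\arcsin(1/M))$ we have

\[ \sup_{\lambda \in \Sigma_\varphi}\|\lambda R(\lambda,A)\| \le \frac{M}{1 - M\sin\varphi}. \]

The argument is always taken in the interval $(-\pi,\pi)$.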


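Proof For $x \in D(A)$ we have $\lambda R(\lambda,A)x = x + R(\lambda,A)Ax \to x$ as $\lambda \to \infty$, from which
it follows that $M \ge 1$.

Proposition 10.27 implies that for every $\mu > 0$ the open ball with centre $\mu$ and radius
$1/\|R(\mu,A)\|$ is contained in $\rho(A)$. The union of these balls is a sector; we shall now
verify that the sine of its angle equals at least 1/M.

Let $\varphi \in (0,\arcsin(1/M)) \subseteq (0,\tfrac12\pi)$. Fix $\lambda \in \Sigma_\varphi$ and let $\mu > 0$ be determined by
the requirement that the triangle spanned by 0, $\lambda$, $\mu$ has a right angle at $\lambda$ (thus, by
Pythagoras, $|\lambda - \mu|^2 + |\lambda|^2 = |\mu|^2$, so $\mu = |\lambda|^2/\mathrm{Re}\,\lambda$). Let $\theta$ denote the angle of $\lambda$
with the positive real line. See Figure 13.2. Then $|\lambda - \mu|/|\mu| = \sin\theta < \sin\varphi < 1/M$, so
$|\lambda - \mu| < |\mu|/M \le 1/\|R(\mu,A)\|$. Hence $\lambda \in \rho(A)$ and by the estimate for the Neumann
series in Proposition 10.27,

\[ \begin{aligned}
\|R(\lambda,A)\| &\le \|R(\mu,A)\|\sum_{n=0}^\infty |\lambda - \mu|^n\|R(\mu,A)\|^n \le \frac{M}{|\mu|}\sum_{n=0}^\infty \Bigl(\frac{|\lambda - \mu|}{|\mu|}\Bigr)^n M^n \\
&\le \frac{M}{|\mu|}\sum_{n=0}^\infty (\sin\varphi)^n M^n \le \frac{M}{1 - M\sin\varphi}\cdot\frac{1}{|\lambda|}.
\end{aligned} \]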
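
The elementary geometry behind the choice of $\mu$ can be verified directly. The following sketch picks one sample point $\lambda$ (an arbitrary choice for illustration) and checks the right angle at $\lambda$ and the identity $|\lambda - \mu|/|\mu| = \sin\theta$.

```python
import cmath
import math


def right_angle_centre(lam):
    """Return mu > 0 such that the triangle 0, lam, mu has a right angle at lam.

    Solving |lam - mu|^2 + |lam|^2 = mu^2 for real mu gives mu = |lam|^2 / Re(lam).
    """
    return abs(lam) ** 2 / lam.real


theta = 0.4                          # angle of lam with the positive real line
lam = 2.0 * cmath.exp(1j * theta)    # arbitrary sample point with Re(lam) > 0
mu = right_angle_centre(lam)

pythagoras_gap = abs(lam - mu) ** 2 + abs(lam) ** 2 - mu ** 2
sine_ratio = abs(lam - mu) / mu      # should equal sin(theta)
```

Since $|\mu| \ge |\lambda|$ in this configuration, a bound of order $1/|\mu|$ for the resolvent at $\lambda$ is automatically a bound of order $1/|\lambda|$, which is how the last step of the estimate is obtained.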


Figure 13.2 The proof of Lemma 13.33

A typical application of this lemma is the second part of the next lemma, which
extends a uniform bound on $\lambda R(\lambda,A)$ on a half-plane to a larger sector. For reasons of
completeness we also include its counterpart for uniform bounds on $R(\lambda,A)$.


Lemma 13.34. Suppose that the half-plane $\mathbb{C}_+ = \{\mathrm{Re}\,\lambda > 0\} = \{|\arg(\lambda)| < \tfrac12\pi\}$ is
contained in the resolvent set $\rho(A)$ of the linear operator A acting in X. Then:


(1) if $\sup_{\lambda \in \mathbb{C}_+}\|R(\lambda,A)\| < \infty$, then there exists a $\delta > 0$ such that $\{\mathrm{Re}\,\lambda > -\delta\} \subseteq \rho(A)$
and
\[ \sup_{\mathrm{Re}\,\lambda > -\delta}\|R(\lambda,A)\| < \infty; \]
(2) if $\sup_{\lambda \in \mathbb{C}_+}\|\lambda R(\lambda,A)\| < \infty$, then there exists a $\delta > 0$ such that
$\{|\arg\lambda| < \tfrac12\pi + \delta\} \subseteq \rho(A)$ and
\[ \sup_{|\arg\lambda| < \frac12\pi + \delta}\|\lambda R(\lambda,A)\| < \infty. \]


Proof Proposition 10.29 implies that in case (1) we have $i\mathbb{R} \subseteq \rho(A)$, and that in case
(2) we have $i\mathbb{R} \setminus \{0\} \subseteq \rho(A)$. The result now follows from Proposition 10.27 applied to
the points $\lambda \in i\mathbb{R}$ (in case (1)) and Lemma 13.33 applied to the operators $\pm iA$ (in case
(2)).


In Hilbert spaces we have the following characterisation of contractive analytic
$C_0$-semigroups (for an extension to Banach spaces see Problem 13.18).


Theorem 13.35 (Lumer–Phillips, analytic contraction semigroups). Let A be a densely
defined closed operator in a Hilbert space H and let $0 < \eta < \tfrac12\pi$. The following assertions
are equivalent:


(1) A generates a contractive analytic $C_0$-semigroup on H on the sector $\Sigma_\eta$;
(2) $\mu - A$ has dense range for some $\mu > 0$ and $-(Ax|x) \in \overline{\Sigma_{\frac12\pi - \eta}}$ for all $x \in D(A)$.

Figure 13.3 $|\lambda - (Ax|x)| \ge |\lambda|$

Proof By a multiplicative scaling of A we may assume that $\mu = 1$.
(1)⇒(2): Let $|\eta'| < \eta$, $x \in D(A)$, and consider the function $f(t) := \mathrm{Re}(S(te^{i\eta'})x|x)$.
Observe that $f(0) = \|x\|^2$. Furthermore, for all $t \ge 0$ we have

\[ |f(t)| \le |(S(te^{i\eta'})x|x)| \le \|S(te^{i\eta'})x\|\,\|x\| \le \|x\|^2, \]

where we used that S is contractive on $\Sigma_\eta$. From these two observations we infer that
$f'(0) \le 0$. On the other hand, differentiating f gives

\[ f'(t) = \mathrm{Re}(e^{i\eta'}S(te^{i\eta'})Ax|x). \]

Evaluating at $t = 0$ gives

\[ \mathrm{Re}(e^{i\eta'}(Ax|x)) \le 0. \]

This can only be true for all $|\eta'| < \eta$ if $-(Ax|x) \in \overline{\Sigma_{\frac12\pi - \eta}}$.


(2)⇒(1): Set $\lambda := re^{i\eta'}$ with $r > 0$ and $|\eta'| < \eta$. We want to show that for all $x \in D(A)$
we have $\|(\lambda - A)x\| \ge r\|x\| = |\lambda|\,\|x\|$. For this we may assume that $\|x\| = 1$.

From $\lambda \in \Sigma_\eta$ and $-(Ax|x) \in \overline{\Sigma_{\frac12\pi - \eta}}$ it is easy to see that $|\lambda - (Ax|x)| \ge |\lambda|$: the angle
between $\lambda$ and $-(Ax|x)$ is at most $\tfrac12\pi$. See Figure 13.3. As a consequence,

\[ \|(\lambda - A)x\| \ge |((\lambda - A)x|x)| = |\lambda - (Ax|x)| \ge |\lambda| = |\lambda|\,\|x\|. \qquad (13.12) \]

From this inequality and Proposition 10.26 we infer that $\lambda - A$ has closed range.
Therefore, to show that this operator is invertible, it suffices to show that it has dense
range. This will be deduced from the assumption that $I - A$ has dense range. Since $I - A$
has also closed range, we have in fact $1 \in \rho(A)$. Now suppose, for a contradiction, that
some $\lambda_1 \in \Sigma_\eta$ belongs to $\sigma(A)$. Set $\lambda_t := (1 - t) + t\lambda_1$. Then $\lambda_t \in \Sigma_\eta$ for all $t \in [0,1]$.
Let $t_0 := \inf\{t \in [0,1] : \lambda_t \in \sigma(A)\}$. Then $t_0 \in (0,1]$ and $\lim_{t \uparrow t_0}\|R(\lambda_t,A)\| = \infty$ since
resolvent norms diverge as we approach the boundary of the spectrum by Proposition
10.29. This clearly contradicts (13.12), which tells us that $\|R(\lambda_t,A)\| \le 1/|\lambda_t|$
for all $t \in [0,t_0)$; the right-hand side stays bounded because the segment from 1 to $\lambda_1$ has positive distance to 0.

We have now shown that $\Sigma_\eta \subseteq \rho(A)$ and $\|R(\lambda,A)\| \le 1/|\lambda|$ on this sector. By Lemma
13.33, the resolvent of A extends holomorphically to a slightly larger sector on which
it is still true that $\|R(\lambda,A)\| \le M/|\lambda|$, for some finite constant M (possibly greater than 1).
Theorem 13.30 then implies that the semigroup generated by A is bounded analytic.
Its contractivity on the sector $\Sigma_\eta$ is obtained by applying
the Hille–Yosida theorem to the operators $e^{i\eta'}A$ with $|\eta'| < \eta$.
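
The geometric inequality illustrated by Figure 13.3 can be sampled numerically. In the sketch below the complex number w stands in for $(Ax|x)$ with $\|x\| = 1$; the random sampling, the value $\eta = 0.6$, and the function name `min_ratio` are assumptions made for illustration.

```python
import cmath
import math
import random


def min_ratio(eta, samples=20000, seed=0):
    """Smallest observed value of |lam - w| / |lam| over random samples with
    lam in the sector of angle eta and -w in the closed sector of angle pi/2 - eta."""
    rng = random.Random(seed)
    worst = float("inf")
    half = math.pi / 2 - eta
    for _ in range(samples):
        lam = rng.uniform(0.1, 10.0) * cmath.exp(1j * rng.uniform(-eta, eta))
        minus_w = rng.uniform(0.0, 10.0) * cmath.exp(1j * rng.uniform(-half, half))
        w = -minus_w   # plays the role of (Ax|x) with ||x|| = 1
        worst = min(worst, abs(lam - w) / abs(lam))
    return worst


ratio = min_ratio(eta=0.6)
```

The reason the ratio never drops below 1 is that $\lambda$ and $-w$ enclose an angle of at most $\tfrac12\pi$, so adding $-w$ to $\lambda$ cannot shorten it.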


The conditions of the theorem are satisfied if −A is a positive selfadjoint operator on
H. In that case, the semigroup S generated by A is given by

\[ S(t) = \int_{[0,\infty)} e^{-t\lambda}\,dP(\lambda), \]

where P is the projection-valued measure associated with −A. More generally, this formula
can be used to associate a $C_0$-semigroup of contractions with every normal operator
A; see Theorem 13.63.

With the same proofs, both Theorem 13.35 and its corollary extend to $\eta = 0$, provided
we interpret $\Sigma_0$ as the positive real line and replace ‘analytic $C_0$-semigroup of
contractions’ by ‘$C_0$-semigroup of contractions’. The theorem then takes the following
form:


Theorem 13.36. Let A be a densely defined operator on a Hilbert space H. The following
assertions are equivalent:


(1) A generates a $C_0$-semigroup of contractions on H;
(2) $\mu - A$ has dense range for some $\mu > 0$ and $-\mathrm{Re}(Ax|x) \ge 0$ for all $x \in D(A)$.


The condition ‘$-\mathrm{Re}(Ax|x) \ge 0$ for all $x \in D(A)$’ says that −A is accretive. Since
the open half-line $(0,\infty)$ is contained in the resolvent set of any operator generating a
$C_0$-semigroup of contractions, in Theorems 13.35 and 13.36, the condition ‘$\mu - A$ has
dense range for some $\mu > 0$’ may be replaced by ‘$1 \in \rho(A)$’. An accretive operator −A
satisfying $1 \in \rho(A)$ is also called a maximal accretive operator, or briefly, an m-accretive
operator.


13.4.c Semigroups Associated with Forms


The first estimate of Theorem 12.12 shows that the assumptions of the Hille-Yosida 
theorem are fulfilled. The second estimate, combined with Corollary 12.13 and Lemma 
13.34 applied to A — 6 and A, implies that the conditions of Theorem 13.30 are fulfilled. 
Thus we obtain the following result. 


Theorem 13.37 (Bounded analytic semigroups via forms). Let V and H be Hilbert
spaces, with V continuously and densely embedded in H, and let −A be the densely
defined closed operator in H associated with a bounded accretive form $\mathfrak{a}$ on V. Then A
generates a $C_0$-contraction semigroup on H. Moreover,


(1) for all $\delta > 0$ the operator $A - \delta$ generates a bounded analytic $C_0$-semigroup on H;
(2) if, in addition to the above assumptions, $\mathfrak{a}$ is coercive, then A generates a bounded
analytic $C_0$-semigroup on H.


Example 13.38 (Operators in Divergence Form I). Let D be a nonempty bounded open
subset of $\mathbb{R}^d$. In $H = L^2(D)$ we consider the divergence form operators

\[ A_a = \operatorname{div}(a\nabla) \]

of Section 12.3.e subject to Dirichlet conditions. As in that section we assume that the
matrix-valued function $a : D \to M_d(\mathbb{C})$ satisfies


(i) the functions $a_{ij} : D \to \mathbb{C}$ are measurable and bounded;
(ii) the following accretivity condition holds:

\[ \mathrm{Re}\sum_{i,j=1}^d a_{ij}(x)\xi_i\overline{\xi_j} \ge 0, \quad \xi \in \mathbb{C}^d. \]

The operator $-A_a$ is rigorously defined as the densely defined closed operator associated
with the form

\[ \mathfrak{a}_a(u,v) = \int_D a\nabla u \cdot \overline{\nabla v}\,dx \]

on $V = H_0^1(D)$. The form $\mathfrak{a}_a$ satisfies the assumptions of the first part of Theorem
13.37.

If the accretivity assumption of Section 12.3.e is replaced by the coercivity condition

\[ \mathrm{Re}\sum_{i,j=1}^d a_{ij}(x)\xi_i\overline{\xi_j} \ge \alpha|\xi|^2 \]

of Section 11.3.b, with $\alpha > 0$, then the second part of Theorem 13.37 can be applied.


We show next that generators of analytic $C_0$-contraction semigroups are obtained if
the range of $\mathfrak{a}$ is contained in the closure of a subsector strictly contained in the open
right half-plane.


Definition 13.39 (Sectorial forms). Let $0 < \omega < \tfrac12\pi$. A form $\mathfrak{a}$ on H is called
$\omega$-sectorial if

\[ \mathfrak{a}(v) := \mathfrak{a}(v,v) \in \overline{\Sigma_\omega}, \quad v \in D(\mathfrak{a}). \]


Theorem 13.40 (Analytic contraction semigroups via forms). Let H be a Hilbert space
and let A be the densely defined closed operator in H associated with a densely defined
closed form $\mathfrak{a}$ that is $\omega$-sectorial for some $0 < \omega < \tfrac12\pi$. Then −A generates an analytic
$C_0$-semigroup of contractions on the sector $\Sigma_{\frac12\pi - \omega}$.


Proof This is an immediate consequence of Theorem 13.35.


Example 13.41 (Operators in divergence form II). Consider again the divergence form
operator

\[ A_a := \operatorname{div}(a\nabla) \qquad (13.13) \]

in $L^2(D)$, subject to Dirichlet conditions. As before we assume that D is a nonempty
bounded open subset of $\mathbb{R}^d$. We now assume that the matrix-valued function
$a : D \to M_d(\mathbb{C})$ satisfies


(i) the functions $a_{ij} : D \to \mathbb{C}$ are measurable and bounded;
(ii) there exists a constant $\alpha > 0$ such that

\[ \sum_{i,j=1}^d a_{ij}(x)\xi_i\overline{\xi_j} \ge \alpha|\xi|^2, \quad \xi \in \mathbb{C}^d. \]

The uniform ellipticity condition (ii) is stronger than the corresponding condition of
Example 13.38, in that no real parts are taken. It implies that the form $\mathfrak{a}_a$ of Example
13.38 takes values in $[0,\infty)$, so it is $\omega$-sectorial for all $\omega \in (0,\tfrac12\pi)$. Accordingly, $A_a$
generates an analytic $C_0$-semigroup of contractions on every sector $\Sigma_\theta$ with $\theta \in (0,\tfrac12\pi)$.


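Sectorial forms of angle less than $\tfrac12\pi$ are continuous and accretive; this clarifies the
relationship between Theorems 13.37 and 13.40. Accretivity is clear, and continuity
follows from the following proposition.


Proposition 13.42. Let $\mathfrak{a}$ be an $\omega$-sectorial form on H with $0 < \omega < \tfrac12\pi$. Then $\mathfrak{a}$ is
continuous and for all $u,v \in D(\mathfrak{a})$ we have

\[ |\mathfrak{a}(u,v)| \le (1 + \tan\omega)(\mathrm{Re}\,\mathfrak{a}(u))^{1/2}(\mathrm{Re}\,\mathfrak{a}(v))^{1/2}. \]

Proof By the Cauchy–Schwarz inequality applied to the form $\mathrm{Re}\,\mathfrak{a}$,

\[ |\mathrm{Re}\,\mathfrak{a}(u,v)| \le (\mathrm{Re}\,\mathfrak{a}(u))^{1/2}(\mathrm{Re}\,\mathfrak{a}(v))^{1/2}. \]

Let us now consider the imaginary part $\mathrm{Im}\,\mathfrak{a} := \frac{1}{2i}(\mathfrak{a} - \mathfrak{a}^*)$ (where $\mathfrak{a}^*(u,v) := \overline{\mathfrak{a}(v,u)}$ for
$u,v \in D(\mathfrak{a}^*) := D(\mathfrak{a})$). Fix $u,v \in D(\mathfrak{a})$. Replacing v by $e^{i\theta}v$ if necessary we may assume
that $\mathrm{Im}\,\mathfrak{a}(u,v) \in \mathbb{R}$. Then $\mathrm{Im}\,\mathfrak{a}(u,v) = \mathrm{Im}\,\mathfrak{a}(v,u)$ and therefore, by $\omega$-sectoriality,

\[ \begin{aligned}
|\mathrm{Im}\,\mathfrak{a}(u,v)| &= \tfrac14\,|\mathrm{Im}\,\mathfrak{a}(u+v,u+v) - \mathrm{Im}\,\mathfrak{a}(u-v,u-v)| \\
&\le \tfrac14\tan\omega\,\bigl(\mathrm{Re}\,\mathfrak{a}(u+v,u+v) + \mathrm{Re}\,\mathfrak{a}(u-v,u-v)\bigr) \\
&= \tfrac12\tan\omega\,\bigl(\mathrm{Re}\,\mathfrak{a}(u) + \mathrm{Re}\,\mathfrak{a}(v)\bigr).
\end{aligned} \]

Replacing u and v with $\varepsilon^{1/4}u$ and $\varepsilon^{-1/4}v$ gives

\[ |\mathrm{Im}\,\mathfrak{a}(u,v)| \le \tfrac12\tan\omega\,\Bigl(\sqrt{\varepsilon}\,\mathrm{Re}\,\mathfrak{a}(u) + \frac{1}{\sqrt{\varepsilon}}\,\mathrm{Re}\,\mathfrak{a}(v)\Bigr). \]

If $\mathrm{Re}\,\mathfrak{a}(u) \ne 0$ we may take $\varepsilon := \mathrm{Re}\,\mathfrak{a}(v)/\mathrm{Re}\,\mathfrak{a}(u)$ and obtain

\[ |\mathrm{Im}\,\mathfrak{a}(u,v)| \le \tan\omega\,(\mathrm{Re}\,\mathfrak{a}(u))^{1/2}(\mathrm{Re}\,\mathfrak{a}(v))^{1/2}. \]

If $\mathrm{Re}\,\mathfrak{a}(u) = 0$, then

\[ |\mathrm{Im}\,\mathfrak{a}(u,v)| \le \tfrac12\tan\omega\,\mathrm{Re}\,\mathfrak{a}(v), \]

and applying this with u and v replaced by $u/\delta$ and $\delta v$, upon letting $\delta \downarrow 0$ we obtain
$\mathrm{Im}\,\mathfrak{a}(u,v) = 0$.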
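
The inequality of Proposition 13.42 can be tested on a finite-dimensional example. The sketch below assumes a diagonal form on $\mathbb{C}^n$ with coefficients $d_k(1 + is_k\tan\omega)$, where $d_k \ge 0$ and $|s_k| \le 1$, which makes the form $\omega$-sectorial; the random data and all names are illustrative assumptions.

```python
import math
import random


def make_form(d, s, omega):
    """a(u, v) = sum_k d_k (1 + i s_k tan(omega)) u_k conj(v_k), d_k >= 0, |s_k| <= 1.

    Then a(u, u) has real part sum d_k |u_k|^2 and |Im a(u,u)| <= tan(omega) Re a(u,u).
    """
    coeffs = [dk * (1 + 1j * sk * math.tan(omega)) for dk, sk in zip(d, s)]

    def a(u, v):
        return sum(c * uk * vk.conjugate() for c, uk, vk in zip(coeffs, u, v))

    return a


rng = random.Random(1)
omega = 1.0
n = 6
d = [rng.uniform(0.0, 5.0) for _ in range(n)]
s = [rng.uniform(-1.0, 1.0) for _ in range(n)]
a = make_form(d, s, omega)


def random_vector():
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]


# Smallest observed value of (right-hand side) - (left-hand side).
worst_slack = float("inf")
for _ in range(5000):
    u, v = random_vector(), random_vector()
    lhs = abs(a(u, v))
    rhs = (1 + math.tan(omega)) * math.sqrt(a(u, u).real * a(v, v).real)
    worst_slack = min(worst_slack, rhs - lhs)
```

A nonnegative `worst_slack` is consistent with the proposition; it is of course a sanity check on samples, not a proof.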


13.4.d Maximal Regularity 


In Section 13.3 we have seen that the mild solution u of the inhomogeneous problem
$u' = Au + f$ with initial condition $u(0) = u_0$, which is given by

\[ u(t) = S(t)u_0 + \int_0^t S(t-s)f(s)\,ds, \quad t \ge 0, \]

is also a strong solution, that is, for all $t \ge 0$ we have $\int_0^t u(s)\,ds \in D(A)$ and

\[ u(t) = u_0 + A\int_0^t u(s)\,ds + \int_0^t f(s)\,ds. \]

In general it cannot be asserted, however, that

\[ u(t) = u_0 + \int_0^t Au(s)\,ds + \int_0^t f(s)\,ds, \]

the problem being that u need not take values in D(A) almost everywhere, and even if it
does so we cannot be certain that $s \mapsto Au(s)$ is integrable on intervals (0,t). The aim of
the present section is to prove that these things do hold if A generates a bounded analytic
$C_0$-semigroup on a Hilbert space.


Theorem 13.43 (Maximal regularity). Let A be the generator of a bounded analytic
$C_0$-semigroup S on a Hilbert space H. Then for all $f \in L^2(\mathbb{R}_+;H)$ the mild solution $u = u_f$
of the inhomogeneous problem $u' = Au + f$ with initial condition $u(0) = 0$ enjoys the
following properties:


(1) u belongs to D(A) almost everywhere, Au belongs to $L^2(\mathbb{R}_+;H)$, and for almost all
$t \ge 0$ we have

\[ u(t) = \int_0^t Au(s)\,ds + \int_0^t f(s)\,ds; \]

(2) we have
\[ \|Au\|_2 \le C\|f\|_2, \]
where $C = \sup_{\xi \in \mathbb{R} \setminus \{0\}}\|AR(i\xi,A)\|$.


By the Lebesgue differentiation theorem (Theorem 2.39, or rather its vector-valued
version, which is proved in exactly the same way), the identity in (1) implies
that u is differentiable almost everywhere and that the pointwise identity

\[ u'(t) = Au(t) + f(t) \]

holds for almost all $t \ge 0$. This, in combination with (2), implies that also $u'$ belongs
to $L^2(\mathbb{R}_+;H)$ (with estimate $\|u'\|_2 \le (C+1)\|f\|_2$). This explains the name ‘maximal
regularity’ attached to the theorem.
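
To see what the theorem says in the simplest situation, one can work out a scalar example by hand. The following computation assumes $H = \mathbb{C}$, $A = -a$ with $a > 0$, $a \ne 1$, and the particular right-hand side $f(s) = e^{-s}$; none of these choices come from the text.

```latex
% Scalar model: H = \mathbb{C}, A = -a with a > 0, a \neq 1, f(s) = e^{-s}.
\begin{align*}
C &= \sup_{\xi \neq 0} \Bigl| \frac{-a}{i\xi + a} \Bigr|
   = \sup_{\xi \neq 0} \frac{a}{\sqrt{\xi^2 + a^2}} = 1, \\
u(t) &= \int_0^t e^{-a(t-s)} e^{-s}\,ds = \frac{e^{-t} - e^{-at}}{a - 1}, \\
\|f\|_2^2 &= \int_0^\infty e^{-2s}\,ds = \frac12, \\
\|Au\|_2^2 &= \frac{a^2}{(a-1)^2} \int_0^\infty \bigl(e^{-t} - e^{-at}\bigr)^2\,dt
   = \frac{a^2}{(a-1)^2} \Bigl( \frac12 - \frac{2}{a+1} + \frac{1}{2a} \Bigr)
   = \frac{a}{2(a+1)} \le \frac12 = C^2 \|f\|_2^2 .
\end{align*}
```

In this example the estimate in (2) holds with room to spare, and the ratio $\|Au\|_2/\|f\|_2 = (a/(a+1))^{1/2}$ approaches the constant $C = 1$ as $a \to \infty$.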

We begin with a reduction to a class of “nice” functions f. To this end, for subspaces
F and Y of $L^2(\mathbb{R}_+)$ and H respectively, we introduce the notation $F \otimes Y$ for the vector
space of all linear combinations of functions $f : \mathbb{R}_+ \to H$ of the form $f = \phi \otimes y$ with
$\phi \in F$ and $y \in Y$, where

\[ (\phi \otimes y)(t) := \phi(t)y, \quad t \ge 0. \]

If F and Y are dense in $L^2(\mathbb{R}_+)$ and H respectively, then $F \otimes Y$ is a dense subspace of
$L^2(\mathbb{R}_+;H)$. This is because the dt-simple functions are dense in $L^2(\mathbb{R}_+;H)$ and every
such function is a linear combination of functions of the form $1_B \otimes h$ with B a Borel set
of finite measure and h an element of H; we now approximate $1_B$ with functions in F
(with respect to the norm of $L^2(\mathbb{R}_+)$) and h with elements in Y (with respect to the norm
of H).

In what follows we consider the dense subspaces $F = C_c^1(\mathbb{R}_+)$ and $Y = D(A)$. For
functions $f \in C_c^1(\mathbb{R}_+) \otimes D(A)$ the mild solution of the problem $u' = Au + f$ with initial
condition $u(0) = 0$, given by

\[ u(t) = \int_0^t S(t-s)f(s)\,ds, \quad t \ge 0, \]

is continuously differentiable in H, takes values in D(A), and satisfies $u'(t) = Au(t) +
f(t)$ for every $t \ge 0$.


Lemma 13.44. Let A be the generator of a bounded analytic $C_0$-semigroup S on a
Hilbert space H. If there exists a constant $C \ge 0$, independent of f, such that for all
$f \in C_c^1(\mathbb{R}_+) \otimes D(A)$ the mild solution u associated with f satisfies $Au \in L^2(\mathbb{R}_+;H)$ and

\[ \|Au\|_2 \le C\|f\|_2, \]

then the assertions (1) and (2) of Theorem
13.43 hold for all $f \in L^2(\mathbb{R}_+;H)$, with the same constant C.


Proof Since $C_c^1(\mathbb{R}_+) \otimes D(A)$ is dense in $L^2(\mathbb{R}_+;H)$, for any $f \in L^2(\mathbb{R}_+;H)$ we may
choose functions $f_n \in C_c^1(\mathbb{R}_+) \otimes D(A)$ converging to f in $L^2(\mathbb{R}_+;H)$. Writing u and $u_n$
for the mild solutions corresponding to f and $f_n$, respectively, the assumptions imply
that the functions $Au_n$ form a Cauchy sequence in $L^2(\mathbb{R}_+;H)$ and therefore converge to
a limit v in $L^2(\mathbb{R}_+;H)$.

Using the Cauchy–Schwarz inequality for $L^2(0,T;H)$ and taking the supremum over
$t \in [0,T]$, we obtain

\[ \|u_n - u_m\|_{C([0,T];H)} \le T^{1/2}\bigl(\|Au_n - Au_m\|_{L^2(0,T;H)} + \|f_n - f_m\|_{L^2(0,T;H)}\bigr). \]

It follows that the functions $u_n$ converge uniformly on every interval [0,T] to a function
u. As a result, for all $t \ge 0$ we obtain

\[ u(t) = \int_0^t v(s)\,ds + \int_0^t f(s)\,ds. \]

Since A is closed, a standard subsequence argument furthermore gives that u takes values
in D(A) almost everywhere and $v = Au$ in $L^2(\mathbb{R}_+;H)$.


The proof of Theorem 13.43 relies on the observation that the Fourier—Plancherel 
transform F on L?(R) extends to an isometry from L7(R;H) onto itself, defining its 
action on simple functions in L? (IR; H) by the prescription 


F (1g @h) = (F1g) @h 


for Borel sets B of finite measure and elements h € H and extending this definition by 
linearity. That this extension enjoys the stated properties can be proved in exactly the 
same way as in the scalar-valued case, repeating the proof given for that case word by 
word with the obvious adjustments. 
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Proof of Theorem 13.43 For functions f € C}(R)® D(A), the mild solution u asso- 
ciated with f takes values in D(A) and satisfies Au = Vf in L?(R4;H), where 


t 
Vf(t):= | AS(t—s)f(s)ds, 1>0. 
0 
Thus the assumptions of Lemma 13.44 are satisfied if we can show that V f € L?(IR;;H) 
for all f € C(R,) @ D(A) and 
IVFll2<Cllfll, feCo(R+)@D(A), 


where C is the constant from the statement of the theorem. In order to set the stage for 
the Fourier transform we translate this into a statement about functions defined on the 
full real line. Let K(t) := AS(t) for t > 0 and K(t) := 0 for t < 0, and define 


Vf(t):= [Ke-9F66) ds, feCl(R,)@D(A), reR. 


Then Vf =V f for functions f € C} (IR;) ®@ D(A) (where on the left-hand side we think 
of f as being extended identically zero to all of R), so it suffices to prove that V maps 
C}(R) @ D(A) into L?(R;H) with bound 
Vfll2<Cllfll, f €Cc(R)@ D(A). (13.14) 
This will be achieved by showing that 
Vf=Trf, f€Ci(R)@D(A), (13.15) 


where 7, is the (operator-valued) Fourier multiplier operator with 


m() :=AR(ig A) =igR(iE,A)—-T,  § © R\ {0}, 
that is, 
Inf = F'(m¥ f), f €L?(R;H). 


To see that the operator $T_m$ is well defined and bounded, we note that since $A$ generates
a bounded analytic $C_0$-semigroup the function

\[ m(\xi) = AR(i\xi, A) = i\xi R(i\xi, A) - I, \quad \xi \in \mathbb{R} \setminus \{0\}, \]

is uniformly bounded. Moreover, by holomorphy, this function is continuous from $\mathbb{R} \setminus \{0\}$
into $\mathscr{L}(H)$. As a consequence, the mapping $g \mapsto mg$ given almost everywhere by
applying $m(\xi)$ to $g(\xi)$ is well defined and bounded on $L^2(\mathbb{R};H)$ as required, with norm

\[ \|T_m\| = \sup_{\eta \in \mathbb{R} \setminus \{0\}} \|AR(i\eta, A)\|. \]

Once (13.15) is established, this gives (13.14) with the correct value for $C$.
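The two expressions for the multiplier can be checked by hand in the simplest possible situation. The sketch below assumes the scalar case $H = \mathbb{C}$ with $A = -a$, $a > 0$ (an illustrative assumption, not part of the proof); the function names are ours. In this case $|m(\xi)| = a/\sqrt{a^2 + \xi^2} \le 1$.

```python
def multiplier(xi: float, a: float) -> complex:
    """m(xi) = A R(i xi, A) for the scalar 'operator' A = -a with a > 0."""
    resolvent = 1.0 / (1j * xi + a)  # R(i xi, A) = (i xi - A)^{-1}
    return -a * resolvent


def multiplier_alt(xi: float, a: float) -> complex:
    """The equivalent form m(xi) = i xi R(i xi, A) - I."""
    resolvent = 1.0 / (1j * xi + a)
    return 1j * xi * resolvent - 1.0


if __name__ == "__main__":
    for xi in (-5.0, -0.1, 0.3, 12.0):
        print(xi, abs(multiplier(xi, 2.0)), abs(multiplier(xi, 2.0) - multiplier_alt(xi, 2.0)))
```

Both forms agree up to rounding, and the modulus stays bounded uniformly in $\xi$, which is the scalar shadow of the uniform boundedness used above.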
In order to prove the identity (13.15) we must show that

\[ \mathscr{F}\widetilde{V}f = m\widehat{f}, \quad f \in C_c^1(\mathbb{R}) \otimes D(A). \]

At least formally, the operator $\widetilde{V}$ has the form of a convolution with $K$, so in view of
Proposition 5.30 one is led to believe that the identity $\mathscr{F}\widetilde{V}f = \mathscr{F}(K * f) = \sqrt{2\pi}\,\widehat{K}\widehat{f} = m\widehat{f}$
should hold since, at least formally,

\[ \sqrt{2\pi}\,\widehat{K}(\xi) = \int_{-\infty}^{\infty} e^{-i\xi t}K(t)\,dt = \int_0^\infty e^{-i\xi t}AS(t)\,dt = A\int_0^\infty e^{-i\xi t}S(t)\,dt = AR(i\xi, A) = m(\xi), \]

and this would give the desired result. None of the steps in this formal argument is
rigorous, however, and the remainder of the proof is devoted to presenting a rigorous
version of it.
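We “mollify” both $\widetilde{V}$ and $m$ by defining, for $r > 0$, the regularising operator

\[ \begin{aligned} B(r) &:= -(1 - rA)^{-1}A(r - A)^{-1} \\ &= -r^{-1}(r^{-1} - A)^{-1}A(r - A)^{-1} \\ &= r^{-1}(r^{-1} - A)^{-1} - (r^{-1} - A)^{-1}(r - A)^{-1}. \end{aligned} \tag{13.16} \]

If $y$ belongs to the range of $A$, say $y = Ax$, then

\[ \begin{aligned} -(r^{-1} - A)^{-1}(r - A)^{-1}y &= (r^{-1} - A)^{-1}(r - A)^{-1}[(r - A) - r]x \\ &= (r^{-1} - A)^{-1}x - r(r - A)^{-1}(r^{-1} - A)^{-1}x. \end{aligned} \]

As $r \downarrow 0$ we have $r^{-1}(r^{-1} - A)^{-1}y \to y$ by Propositions 13.10 and 13.8, and the latter one
implies $\|r(r - A)^{-1}\| \le M$ and $\|(r^{-1} - A)^{-1}\| \le M/r^{-1}$, where $M$ is as in the proposition.
Combining these observations, we find that

\[ \lim_{r \downarrow 0} B(r)Ax = Ax, \quad x \in D(A). \tag{13.17} \]

Moreover, (13.16) implies that $\|B(r)\| \le (M/r^{-1})(M/r) + M = M^2 + M$.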
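The algebra behind the regularising operator can be tested in the scalar case. The following sketch again assumes $A = -a$ with $a > 0$ (so that one may take $M = 1$); it compares the defining expression $-(1 - rA)^{-1}A(r - A)^{-1}$ with the split form $r^{-1}(r^{-1} - A)^{-1} - (r^{-1} - A)^{-1}(r - A)^{-1}$ and illustrates that $B(r) \to 1$ as $r \downarrow 0$. The function names are hypothetical.

```python
def B(r: float, a: float) -> float:
    """B(r) = -(1 - rA)^{-1} A (r - A)^{-1} for the scalar A = -a."""
    A = -a
    return -(1.0 / (1.0 - r * A)) * A * (1.0 / (r - A))


def B_split(r: float, a: float) -> float:
    """The split form r^{-1}(r^{-1} - A)^{-1} - (r^{-1} - A)^{-1}(r - A)^{-1}."""
    A = -a
    inv = 1.0 / (1.0 / r - A)  # (r^{-1} - A)^{-1}
    return (1.0 / r) * inv - inv * (1.0 / (r - A))


if __name__ == "__main__":
    for r in (1.0, 0.1, 1e-3, 1e-6):
        print(r, B(r, 2.0), B_split(r, 2.0))
```

In this toy case the values stay below $M^2 + M = 2$ for every $r > 0$, in line with the bound derived from the split form.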
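Define $m_r(\xi) := m(\xi)B(r)$ for $\xi \ne 0$, and

\[ \widetilde{V}_r f(t) := \int_{-\infty}^{t} AS(t-s)B(r)f(s)\,ds = \int_{-\infty}^{\infty} K_r(t-s)f(s)\,ds, \quad f \in C_c^1(\mathbb{R}) \otimes D(A), \]

where $K_r(t) := AS(t)B(r)$ for $t > 0$ and $K_r(t) := 0$ otherwise.
By Theorem 13.31 and the uniform boundedness of $S(t)$ for $t > 0$,

\[ \|K_r(t)\| = \|A^2S(t)(1 - rA)^{-1}(r - A)^{-1}\| \le C_r/t^2 \]

and

\[ \|K_r(t)\| \le \|S(t)\|\,\|A(1 - rA)^{-1}\|\,\|A(r - A)^{-1}\| \le C_r, \]

where $C_r$ is independent of $t > 0$. It follows that $K_r \in L^1(\mathbb{R};\mathscr{L}(H))$ and thus, by dominated
convergence and Proposition 13.8,

\[ \sqrt{2\pi}\,\widehat{K_r}(\xi) = \lim_{\eta \downarrow 0}\int_0^\infty e^{-(\eta + i\xi)t}K_r(t)\,dt = \lim_{\eta \downarrow 0} A(\eta + i\xi - A)^{-1}B(r) = m_r(\xi). \]

Therefore, $\widetilde{V}_r = T_{m_r}$ on $C_c^1(\mathbb{R}) \otimes D(A)$.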
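We now let $r \downarrow 0$. By (13.17) and dominated convergence, for $f = g \otimes x \in C_c^1(\mathbb{R}) \otimes D(A)$
we have

\[ \widetilde{V}_r f(t) = \int_{-\infty}^{t} g(s)S(t-s)B(r)Ax\,ds \to \int_{-\infty}^{t} g(s)S(t-s)Ax\,ds = \widetilde{V}f(t) \]

for every $t \in \mathbb{R}$. Similarly, by (13.17) and the uniform boundedness of the operators $B(r)$,

\[ m_r(\xi)\widehat{f}(\xi) = \widehat{g}(\xi)(i\xi - A)^{-1}B(r)Ax \to \widehat{g}(\xi)(i\xi - A)^{-1}Ax = m(\xi)\widehat{f}(\xi), \]

with convergence in $L^2(\mathbb{R};H)$. Therefore, $T_{m_r}f \to T_m f$ in $L^2(\mathbb{R};H)$, and along an
appropriate subsequence we also have almost everywhere convergence. This shows that
$\widetilde{V}f = T_m f$, and by linearity this implies $\widetilde{V}f = T_m f$ for all $f \in C_c^1(\mathbb{R}) \otimes D(A)$.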


We demonstrate the usefulness of maximal regularity by proving local existence for
the time-dependent inhomogeneous Cauchy problem

\[ \begin{cases} u'(t) = A(t)u(t) + f(t), & t \in [0,T], \\ u(0) = 0, \end{cases} \tag{13.18} \]

where $(A(t))_{t \in [0,T]}$ is a family of densely defined closed operators on a Hilbert space $H$.
We make the following assumptions: 


• each domain $D(A(t))$ is isomorphic to a fixed Banach space $D$ which is continuously
and densely embedded in $H$;

• the mapping $t \mapsto A(t) \in \mathscr{L}(D,H)$ is continuous on $[0,T]$;

• the operator $A(0)$ is invertible and generates a bounded analytic $C_0$-semigroup on $H$.


The idea is to rewrite the problem in the form

\[ u'(t) = A(0)u(t) + g_u(t) \quad\text{with}\quad g_u(t) := (A(t) - A(0))u(t) + f(t). \]

Now let $0 < a \le T$ and consider a fixed function $u \in L^2(0,a;D)$. Referring to Theorem
13.24, denote by $K_a(u)$ the mild solution $v$ of the inhomogeneous problem

\[ \begin{cases} v'(t) = A(0)v(t) + g_u(t), & t \in [0,a], \\ v(0) = 0. \end{cases} \tag{13.19} \]

Then, at least formally, the solutions of (13.18) are the fixed points of $K_a$. The maximal
regularity of $A(0)$ will now be used to show that $K_a$ is a uniform contraction (that is,
its norm is strictly smaller than one) in $L^2(0,a;D)$ provided $0 < a \le T$ is small enough.
The Banach fixed point theorem then gives the existence of a unique fixed point for $K_a$
in $L^2(0,a;D)$. This fixed point will be called the solution on $(0,a)$.

Indeed, if $u_1, u_2 \in L^2(0,a;D)$, then $K_a(u_1) - K_a(u_2)$ equals the solution $u$ of

\[ u'(t) = A(0)u(t) + g_{u_1}(t) - g_{u_2}(t), \quad u(0) = 0. \]

Since $D$ is isomorphic to $D(A(0))$, which is a Banach space with respect to the norm
$x \mapsto \|A(0)x\|$ since $0 \in \rho(A(0))$, we obtain with the maximal regularity inequality for
the problem (13.19) on the interval $(0,a)$ that

\[ \begin{aligned} \|K_a(u_1) - K_a(u_2)\|_{L^2(0,a;D)} &= \|A(0)u\|_{L^2(0,a;H)} \\ &\le C\|g_{u_1} - g_{u_2}\|_{L^2(0,a;H)} \\ &= C\|[A(\cdot) - A(0)](u_1 - u_2)\|_{L^2(0,a;H)} \\ &\le C\sup_{t \in [0,a]}\|A(t) - A(0)\|_{\mathscr{L}(D,H)}\,\|u_1 - u_2\|_{L^2(0,a;D)}. \end{aligned} \]

To justify the first inequality we extend the inhomogeneity $g_{u_1} - g_{u_2}$ identically $0$ on
$[a,\infty)$ and observe that the mild solution for the resulting inhomogeneous problem on
$\mathbb{R}_+$ restricts to $u$ on the interval $(0,a)$.
If the constant $a > 0$ is small enough, then

\[ \sup\{\|A(t) - A(0)\|_{\mathscr{L}(D,H)} : 0 \le t \le a\} < 1/C \]

and $\|K_a\|_{\mathscr{L}(L^2(0,a;D))} < 1$, and the Banach fixed point theorem provides a unique solution
for (13.18) on $(0,a)$.
If $A$ is a positive selfadjoint operator on a Hilbert space $H$, then $-A$ satisfies the
conditions of Theorem 13.35. Denote by $S$ the analytic $C_0$-semigroup of contractions
generated by $-A$. It can be shown (see Problem 13.12) that for all $t \in \mathbb{R}$ and $x \in H$ the
limit

\[ U(t)x := \lim_{s \downarrow 0} S(s + it)x \]

exists and that the family $(U(t))_{t \in \mathbb{R}}$ is a $C_0$-group of unitary operators with generator
$iA$. The main result of this section is Stone’s theorem, which asserts that for any
selfadjoint operator $A$ the operators $\pm iA$ generate $C_0$-groups of unitary
operators. For the proof of this theorem we need the following auxiliary result. A more
precise version for bounded selfadjoint operators has been proved in Theorem 8.11.
Proposition 13.45. If $A$ is a selfadjoint operator on a Hilbert space $H$, then $\sigma(A) \subseteq \mathbb{R}$
and

\[ \|R(\lambda, A)\| \le \frac{1}{|\mathrm{Im}\,\lambda|}, \quad \lambda \in \mathbb{C} \setminus \mathbb{R}. \]

Proof Let $\lambda = \alpha + i\beta$ with $\beta \ne 0$. For all $x \in D(A)$ we have $(Ax|x) = (x|Ax) = \overline{(Ax|x)}$
and therefore $(Ax|x) \in \mathbb{R}$. Then,

\[ \|(\lambda - A)x\|\,\|x\| \ge |((\lambda - A)x|x)| = |\alpha(x|x) - (Ax|x) + i\beta(x|x)| \ge |\beta|\,\|x\|^2 \]

and therefore $\|(\lambda - A)x\| \ge |\beta|\,\|x\|$. This implies that $\lambda - A$ is injective and by Proposition
10.26 it has closed range. The same argument can be applied to $\overline{\lambda}$ and allows us to
conclude that $\overline{\lambda} - A$ is injective and has closed range. Moreover, using that $((\lambda - A)x|y) =
(x|(\overline{\lambda} - A)y)$, the injectivity of $\overline{\lambda} - A$ implies that $\lambda - A$ has dense range.

We conclude that $\lambda - A$ is bijective, hence invertible, and from the inequality
$\|(\lambda - A)x\| \ge |\beta|\,\|x\|$ we see that $\|R(\lambda, A)\| \le 1/|\beta|$.
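The resolvent bound can be observed directly for a finite-dimensional selfadjoint operator. The sketch below assumes $A$ is a real diagonal matrix (a hypothetical example; every selfadjoint matrix is unitarily equivalent to one), so that $\|R(\lambda, A)\|$ is the largest of the numbers $1/|\lambda - a_k|$.

```python
from typing import Sequence


def resolvent_norm(lam: complex, eigenvalues: Sequence[float]) -> float:
    """Operator norm of (lam - A)^{-1} for A = diag(eigenvalues), lam not real."""
    # For a diagonal matrix the operator norm is the largest modulus on the diagonal.
    return max(1.0 / abs(lam - a) for a in eigenvalues)


def stone_bound(lam: complex) -> float:
    """The bound 1/|Im lam| from Proposition 13.45."""
    return 1.0 / abs(lam.imag)


if __name__ == "__main__":
    eigs = [-3.0, 0.0, 1.5, 7.0]
    for lam in (1 + 0.5j, 7 - 0.01j, -10 + 3j):
        print(lam, resolvent_norm(lam, eigs), stone_bound(lam))
```

The bound is attained when the real part of $\lambda$ is an eigenvalue, which shows that the constant $1/|\mathrm{Im}\,\lambda|$ cannot be improved in general.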


Theorem 13.46 (Stone). For a densely defined operator A in H, the following assertions 
are equivalent: 


(1) A is selfadjoint; 
(2) $iA$ is the generator of a $C_0$-group of unitary operators.


Proof (1)⇒(2): By Proposition 13.45, $\sigma(A)$ is contained in the real line and for all $\lambda \in
\mathbb{C} \setminus \mathbb{R}$ we have $\|R(\lambda, A)\| \le 1/|\mathrm{Im}\,\lambda|$. Hence, by the Hille–Yosida theorem (Theorem
13.17), the operators $\pm iA$ generate $C_0$-contraction semigroups $S_\pm$. Hence by Proposition
13.13 $iA$ generates a $C_0$-group of contractions given by

\[ U(t) := \begin{cases} S_+(t), & t \ge 0, \\ S_-(-t), & t \le 0. \end{cases} \]

Also, since $(iA)^* = -iA^* = -iA$, we have $S_-(t) = S_+^*(t)$ and vice versa, from which it
follows that the operators $U(\pm t)$ are unitary.


(2)⇒(1): Suppose that $iA$ generates the unitary group $(U(t))_{t \in \mathbb{R}}$. From $U(-t) =
(U(t))^{-1} = U^*(t)$ we see that $(U^*(t))_{t \in \mathbb{R}}$ is a $C_0$-group as well. To determine its
generator, which we call $B$ for the moment, suppose that $x \in D(A)$ and $h \in D(B)$. Then

\[ (x|Bh) = \lim_{t \to 0}\frac{1}{t}(x|U^*(t)h - h) = \lim_{t \to 0}\frac{1}{t}(U(t)x - x|h) = (iAx|h). \]

This shows that $h \in D(A^*)$ and $-iA^*h = (iA)^*h = Bh$. In the converse direction, if $h \in
D(A^*)$, then for all $x \in D(A)$ we have

\[ (x|-iA^*h) = (iAx|h) = \lim_{t \to 0}\frac{1}{t}(U(t)x - x|h) = \lim_{t \to 0}\frac{1}{t}(x|U^*(t)h - h) = (x|Bh). \]

This shows that $h \in D(B)$ and $Bh = -iA^*h$. We conclude that $B = -iA^*$ with equal
domains. The identity

\[ \frac{1}{t}(U(-t)x - x) = \frac{1}{t}(U^*(t)x - x) \]

then shows that $x \in D(A)$ if and only if $x \in D(A^*)$ and $-iAx = Bx = -iA^*x$.
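Stone’s theorem is transparent for diagonal matrices, where $U(t) = e^{itA}$ acts by multiplication with unimodular numbers. The sketch below assumes a real diagonal $A$ (an illustrative choice, with hypothetical helper names) and checks unitarity, the group law, and that the difference quotient at $t = 0$ approximates $iAx$.

```python
import cmath
import math
from typing import List, Sequence


def U(t: float, eigs: Sequence[float], x: Sequence[complex]) -> List[complex]:
    """Apply U(t) = exp(itA) for A = diag(eigs) to the vector x."""
    return [cmath.exp(1j * t * a) * xk for a, xk in zip(eigs, x)]


def norm(x: Sequence[complex]) -> float:
    """Euclidean norm on C^n."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))


def generator_quotient(h: float, eigs: Sequence[float], x: Sequence[complex]) -> List[complex]:
    """Difference quotient (U(h)x - x)/h, which approximates iAx for small h."""
    return [(u - xk) / h for u, xk in zip(U(h, eigs, x), x)]


if __name__ == "__main__":
    eigs = [-1.0, 0.5, 3.0]
    x = [1.0 + 0j, 2.0 - 1j, -0.5j]
    print(norm(x), norm(U(2.7, eigs, x)))
```

Negative times are allowed as well, reflecting that $U$ is a group and not merely a semigroup.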


Some applications of this theorem will be given in the next section (see Sections
13.6.g and 13.6.h).


13.6 Examples 


In this section we collect some important examples of $C_0$-semigroups and $C_0$-groups.


13.6.a Multiplication Semigroups 
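Let $(\Omega, \mathscr{F}, \mu)$ be a measure space and let $m: \Omega \to \mathbb{K}$ be measurable with real part
bounded from below:

\[ \inf_{\omega \in \Omega} \mathrm{Re}\,m(\omega) =: M > -\infty. \]

The operators

\[ S(t)f := e^{-tm}f, \quad t \ge 0, \]

are bounded on $L^p(\Omega)$, $1 \le p \le \infty$, with norm $\|S(t)\| \le e^{-Mt}$. We will prove that $S$ is a
$C_0$-semigroup on $L^p(\Omega)$ for $1 \le p < \infty$, with generator $A$ given by

\[ D(A) := \{f \in L^p(\Omega) : mf \in L^p(\Omega)\}, \qquad Af := -mf, \quad f \in D(A). \tag{13.20} \]

Fix $1 \le p < \infty$. The semigroup properties (S1) and (S2) are clear and (S3) follows by
dominated convergence. To prove (13.20) let $f \in L^p(\Omega)$ be such that $mf \in L^p(\Omega)$. For
$\mu$-almost all $\omega \in \Omega$ we have

\[ S(t)f(\omega) - f(\omega) = e^{-tm(\omega)}f(\omega) - f(\omega) = -m(\omega)f(\omega)\int_0^t e^{-sm(\omega)}\,ds. \]

Also, by Proposition 13.4, $S(t)f - f = A\int_0^t S(s)f\,ds$. It follows that

\[ A\int_0^t S(s)f\,ds = -mf\int_0^t e^{-sm}\,ds. \]

Next we note that

\[ \lim_{t \downarrow 0}\frac{1}{t}\int_0^t S(s)f\,ds = f \]

and

\[ \lim_{t \downarrow 0} A\,\frac{1}{t}\int_0^t S(s)f\,ds = -\lim_{t \downarrow 0}\Bigl(\frac{1}{t}\int_0^t e^{-sm}\,ds\Bigr)mf = -mf \]

in $L^p(\Omega)$. Since $A$ is closed this implies $f \in D(A)$ and $Af = -mf$.
Conversely, if $f \in D(A)$, then the limit

\[ \lim_{t \downarrow 0}\frac{1}{t}(S(t)f - f) \]

exists in $L^p(\Omega)$ and equals $Af$. Since convergence in $L^p(\Omega)$ implies $\mu$-almost everywhere
convergence along a subsequence, there is a sequence $t_n \downarrow 0$ such that

\[ Af(\omega) = \lim_{n \to \infty}\frac{1}{t_n}\bigl(e^{-t_n m(\omega)}f(\omega) - f(\omega)\bigr) \]

for $\mu$-almost all $\omega \in \Omega$. Clearly,

\[ \lim_{n \to \infty}\frac{1}{t_n}\bigl(e^{-t_n m(\omega)}f(\omega) - f(\omega)\bigr) = -m(\omega)f(\omega). \]

It follows that $mf \in L^p(\Omega)$ and $Af = -mf$.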
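On a finite set with counting measure the multiplication semigroup can be written down explicitly. The sketch below makes that assumption (so $L^p(\Omega) = \mathbb{C}^n$ and every $f$ lies in $D(A)$); the names `S`, `lp_norm`, and `difference_quotient` are ours. It illustrates the semigroup law, the bound $\|S(t)\| \le e^{-Mt}$, and the convergence of the difference quotient to $-mf$.

```python
import cmath
from typing import List, Sequence


def S(t: float, m: Sequence[complex], f: Sequence[complex]) -> List[complex]:
    """Apply S(t)f = exp(-t m) f pointwise on a finite set."""
    return [cmath.exp(-t * mk) * fk for mk, fk in zip(m, f)]


def lp_norm(f: Sequence[complex], p: float) -> float:
    """The l^p norm with respect to counting measure."""
    return sum(abs(v) ** p for v in f) ** (1.0 / p)


def difference_quotient(t: float, m: Sequence[complex], f: Sequence[complex]) -> List[complex]:
    """(S(t)f - f)/t, which approximates Af = -m f for small t."""
    return [(s - fk) / t for s, fk in zip(S(t, m, f), f)]


if __name__ == "__main__":
    m = [0.5 + 2j, 1.0 + 0j, 3.0 - 1j]  # real parts bounded below by M = 0.5
    f = [1.0 + 0j, -2j, 0.5 + 0.5j]
    print(difference_quotient(1e-6, m, f))
    print([-mk * fk for mk, fk in zip(m, f)])
```

For an infinite measure space the same formulas hold pointwise, but the condition $mf \in L^p(\Omega)$ then becomes a genuine restriction on $f$.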


13.6.b The Translation Group 
On the space $L^p(\mathbb{R})$, $1 \le p < \infty$, the formula

\[ (S(t)f)(x) := f(x + t), \quad x \in \mathbb{R},\ t \in \mathbb{R}, \]

defines a $C_0$-group $S$. Its generator $A$ is given by

\[ D(A) = W^{1,p}(\mathbb{R}), \qquad Af = f', \quad f \in D(A). \]

The group properties (G1) and (G2) are clear and (G3) follows from Proposition 2.32.
To prove that $D(A) = W^{1,p}(\mathbb{R})$ and $Af = f'$, we first note that for $f \in C_c^1(\mathbb{R})$ we have

\[ S(t)f(x) - f(x) = f(x + t) - f(x) = \int_0^t f'(x + s)\,ds = \int_0^t S(s)f'(x)\,ds. \]

It follows that $S(t)f - f = \int_0^t S(s)f'\,ds$. Also, $S(t)f - f = A\int_0^t S(s)f\,ds$. It follows that

\[ A\int_0^t S(s)f\,ds = \int_0^t S(s)f'\,ds. \]

Next we note that

\[ \lim_{t \downarrow 0}\frac{1}{t}\int_0^t S(s)f\,ds = f \]

and

\[ \lim_{t \downarrow 0} A\,\frac{1}{t}\int_0^t S(s)f\,ds = \lim_{t \downarrow 0}\frac{1}{t}\int_0^t S(s)f'\,ds = f' \]

in $L^p(\mathbb{R})$. Since $A$ is closed this implies $f \in D(A)$ and $Af = f'$.

Since $C_c^1(\mathbb{R})$ is dense in $L^p(\mathbb{R})$ and invariant under translations, from Proposition
13.5 we infer that $C_c^1(\mathbb{R})$ is dense in $D(A)$. Since $C_c^1(\mathbb{R})$ is also dense in $W^{1,p}(\mathbb{R})$ and
$\|f\|_{D(A)} = \|f\|_p + \|Af\|_p = \|f\|_p + \|f'\|_p = \|f\|_{W^{1,p}(\mathbb{R})}$ for all $f \in C_c^1(\mathbb{R})$, it follows that
$D(A) = W^{1,p}(\mathbb{R})$ and $Af = f'$ for all $f \in D(A) = W^{1,p}(\mathbb{R})$.
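The convergence of the difference quotient of the translation group to $f'$ can be observed numerically. The sketch below takes $p = 2$ and the Gaussian $f(x) = e^{-x^2}$, which is not compactly supported but belongs to $W^{1,2}(\mathbb{R})$ (both choices are assumptions made for illustration), and approximates the $L^2$-norm of $(S(t)f - f)/t - f'$ by a midpoint rule on $[-10, 10]$.

```python
import math
from typing import Callable


def f(x: float) -> float:
    return math.exp(-x * x)


def df(x: float) -> float:
    return -2.0 * x * math.exp(-x * x)


def l2_norm(g: Callable[[float], float], half_width: float = 10.0, n: int = 4000) -> float:
    """Midpoint-rule approximation of the L^2 norm of g on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    return math.sqrt(h * sum(g(-half_width + (k + 0.5) * h) ** 2 for k in range(n)))


def quotient_error(t: float) -> float:
    """Approximate L^2 distance between (S(t)f - f)/t and f'."""
    return l2_norm(lambda x: (f(x + t) - f(x)) / t - df(x))


if __name__ == "__main__":
    for t in (0.1, 0.01, 0.001):
        print(t, quotient_error(t))
```

The error decays roughly linearly in $t$, as one expects from a second-order Taylor expansion of $f$.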


13.6.c The Heat Semigroup 


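The Heat Semigroup on $\mathbb{R}^d$ For $1 \le p < \infty$ and $t > 0$ we define a linear operator $H(t)$
on $L^p(\mathbb{R}^d)$ by

\[ H(t)f(x) := K_t * f(x), \quad f \in C_c(\mathbb{R}^d),\ x \in \mathbb{R}^d, \tag{13.21} \]

where

\[ K_t(x) := (4\pi t)^{-d/2}e^{-|x|^2/4t} \]

is the heat kernel. Since $K_t \in L^1(\mathbb{R}^d)$ with $\|K_t\|_1 = 1$, it follows from Young’s inequality
(Proposition 2.33) that for all $1 \le p < \infty$ the operators $H(t)$ are well defined and bounded
on $L^p(\mathbb{R}^d)$ and satisfy

\[ \|H(t)f\|_{L^p(\mathbb{R}^d)} \le \|f\|_{L^p(\mathbb{R}^d)}. \]

We furthermore set $H(0) := I$, the identity operator on $L^p(\mathbb{R}^d)$. We will prove that the
family $H = (H(t))_{t \ge 0}$ is a $C_0$-semigroup of contractions, the so-called heat semigroup,
on $L^p(\mathbb{R}^d)$ and that its generator $A$ is the weak $L^p$-Laplacian $\Delta$. Thus the heat semigroup
solves the linear heat equation

\[ \begin{cases} \dfrac{\partial u}{\partial t}(t,x) = \Delta u(t,x), & t \ge 0,\ x \in \mathbb{R}^d, \\ u(0,x) = f(x), & x \in \mathbb{R}^d, \end{cases} \]

in the sense that its orbits satisfy $\frac{d}{dt}H(t)f = \Delta H(t)f$ and $H(0)f = f$.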


Step 1 – Fix $1 \le p < \infty$. First we prove that $H$ is a $C_0$-semigroup on $L^p(\mathbb{R}^d)$. For all
$t > 0$, by Lemma 5.20 and a change of variables the Fourier transform of $K_t$ is given by

\[ \widehat{K_t}(\xi) = \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} K_t(x)e^{-ix\cdot\xi}\,dx = \frac{1}{(2\pi)^{d/2}}e^{-t|\xi|^2}, \quad \xi \in \mathbb{R}^d. \]

It follows that $(2\pi)^{d/2}\widehat{K_t}\widehat{K_s} = \widehat{K_{t+s}}$ for each $t, s > 0$, and by Proposition 5.30 this implies
$H(t+s)f = H(t)H(s)f$ for all $f \in L^2(\mathbb{R}^d)$. In particular this identity holds for functions
$f \in L^2(\mathbb{R}^d) \cap L^p(\mathbb{R}^d)$, and since we have already seen that the operators $H(t)$ are
contractive on $L^p(\mathbb{R}^d)$ the identity $H(t+s)f = H(t)H(s)f$ extends to general functions
$f \in L^p(\mathbb{R}^d)$, by the density of $L^2(\mathbb{R}^d) \cap L^p(\mathbb{R}^d)$ in $L^p(\mathbb{R}^d)$.

Strong continuity of the semigroup is an immediate consequence of Proposition 2.34.
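The identity $K_t * K_s = K_{t+s}$ underlying the semigroup property can also be checked numerically. The sketch below restricts to $d = 1$ and approximates the convolution integral by the trapezoidal rule on a truncated interval (both simplifications are ours).

```python
import math


def heat_kernel(t: float, x: float) -> float:
    """K_t(x) = (4 pi t)^{-1/2} exp(-x^2 / 4t) in dimension d = 1."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def convolve_kernels(t: float, s: float, x: float,
                     half_width: float = 20.0, n: int = 4000) -> float:
    """Trapezoidal approximation of (K_t * K_s)(x) over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * heat_kernel(t, x - y) * heat_kernel(s, y)
    return total * h


if __name__ == "__main__":
    for x in (0.0, 0.7, -1.3):
        print(x, convolve_kernels(0.3, 0.5, x), heat_kernel(0.8, x))
```

Because the integrand is a Gaussian, the trapezoidal rule is extremely accurate here and the two columns agree to many digits.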


Step 2 – We now prove that $A = \Delta$, the weak $L^p$-Laplacian, with equal domains.
We begin by proving the inclusion $D(\Delta) \subseteq D(A)$ along with the fact that

\[ Af = \Delta f, \quad f \in D(\Delta). \]

First let $f \in C_c^2(\mathbb{R}^d)$. For all $t > 0$ we have the pointwise identities

\[ \frac{\partial}{\partial t}K_t = \Delta K_t, \qquad \frac{\partial}{\partial t}(K_t * f) = \Delta K_t * f, \]

and therefore

\[ H(t)f - f = K_t * f - f = \int_0^t \Delta K_s * f\,ds = \int_0^t \Delta H(s)f\,ds. \tag{13.22} \]

Since we are assuming that $f \in C_c^2(\mathbb{R}^d)$, all identities can be rigorously justified by
elementary calculus arguments. As we have already observed at the beginning of Section
11.1.e, $C_c^2(\mathbb{R}^d)$ is dense in $D(\Delta)$. Since all terms in the above identity depend
continuously on the graph norm of $D(\Delta)$, the identity extends to arbitrary functions $f \in D(\Delta)$.
Dividing both sides by $t$ and passing to the limit $t \downarrow 0$, by the continuity of $t \mapsto \Delta H(t)f$
as an $L^p(\mathbb{R}^d)$-valued function we obtain that $f \in D(A)$ and $Af = \Delta f$ as claimed. This
completes the proof of the inclusion $D(\Delta) \subseteq D(A)$.

To prove the converse inclusion $D(A) \subseteq D(\Delta)$ we must show that every $f \in D(A)$
admits a weak $L^p$-Laplacian. To this end we multiply both sides of (13.22) with a test
function $\phi$ and integrate by parts. This results in the identity

\[ \int_{\mathbb{R}^d}(H(t)f(x) - f(x))\phi(x)\,dx = \int_0^t\int_{\mathbb{R}^d} H(s)f(x)\Delta\phi(x)\,dx\,ds. \]

Dividing by $t$ and passing to the limit $t \downarrow 0$, and using the assumption $f \in D(A)$, we
obtain the identity

\[ \int_{\mathbb{R}^d} Af(x)\phi(x)\,dx = \int_{\mathbb{R}^d} f(x)\Delta\phi(x)\,dx. \]

This identity precisely expresses that $f$ admits a weak Laplacian, given by the function
$Af \in L^p(\mathbb{R}^d)$.
By Theorem 11.29, for $p = 2$ we have

\[ D(A) = D(\Delta) = W^{2,2}(\mathbb{R}^d). \tag{13.23} \]


Remark 13.47. For $1 < p < \infty$ one has the analogous equality

\[ D(A) = D(\Delta) = W^{2,p}(\mathbb{R}^d), \]

but this is highly nontrivial and depends on the $L^p$-boundedness of the Riesz transforms
(see the Notes to Chapter 5). For $d = 1$ there is the following more elementary argument
that also works for $p = 1$. The inclusion $W^{2,p}(\mathbb{R}^d) \subseteq D(\Delta)$ being clear for any dimension
$d$, the point is to prove the inclusion $D(\Delta) \subseteq W^{2,p}(\mathbb{R})$. If $f \in L^p(\mathbb{R})$ admits a weak
$L^p$-Laplacian $\Delta f = f''$, Theorem 11.12 implies that $f$ admits a weak derivative $f'$ belonging
to $L^p(\mathbb{R})$. This shows that $f$ belongs to $W^{2,p}(\mathbb{R})$.


Remark 13.48. There is a slightly different route to the identification $A = \Delta$ for $p = 2$
which depends on the fact that each of the operators $H(t)$ is a Fourier multiplication
operator associated with the multiplier $m_t(\xi) = \exp(-t|\xi|^2)$. Defining $\widehat{H}(t)g := m_t g$
we obtain a multiplication semigroup $\widehat{H}$ on $L^2(\mathbb{R}^d)$ which is strongly continuous and
whose generator $\widehat{A}$ is given by

\[ D(\widehat{A}) = \{g \in L^2(\mathbb{R}^d) : \xi \mapsto |\xi|^2 g(\xi) \in L^2(\mathbb{R}^d)\}, \qquad \widehat{A}g(\xi) = -|\xi|^2 g(\xi), \quad g \in D(\widehat{A}),\ \xi \in \mathbb{R}^d; \]

this follows from the results proved in Section 13.6.a. This semigroup is related to the
heat semigroup through the identity

\[ H(t) = \mathscr{F}^{-1} \circ \widehat{H}(t) \circ \mathscr{F}, \quad t \ge 0, \]

from which it follows that a function $f \in L^2(\mathbb{R}^d)$ belongs to the domain of the generator
$A$ of $H$ if and only if $\mathscr{F}f = \widehat{f}$ belongs to the domain of the generator $\widehat{A}$ of $\widehat{H}$, in which
case the identity

\[ Af = \mathscr{F}^{-1} \circ \widehat{A} \circ \mathscr{F}f \]

holds. As we have seen, this is the case if and only if $f \in H^2(\mathbb{R}^d)$. Since $H^2(\mathbb{R}^d) =
W^{2,2}(\mathbb{R}^d)$ up to equivalence of norm, this implies (13.23).


The Heat Semigroup on Bounded Domains Let $D$ be a nonempty bounded open set
in $\mathbb{R}^d$.

Proposition 13.49. The Dirichlet and Neumann Laplacians on $L^2(D)$ generate analytic
$C_0$-semigroups of selfadjoint contractions on every sector of angle less than $\frac{1}{2}\pi$.

Proof Everything follows from the Lumer–Phillips theorem (Theorem 13.35), except the
selfadjointness of the semigroup operators which follows from Euler’s theorem (Theorem
13.19) after noting that for $\lambda > 0$ the resolvent operators are selfadjoint.


Alternatively, Proposition 13.49 can be deduced from the spectral theorem for
selfadjoint operators (as such the result is a special case of Theorem 10.54, where further
details are provided). This gives the representation

\[ S(z) = \int_{[0,\infty)} e^{-z\lambda}\,dP(\lambda), \quad \mathrm{Re}\,z > 0, \]

where $P$ is the projection-valued measure associated with the Laplacian under
consideration (see Example 13.64).

The Dirichlet and Neumann heat semigroups on $L^2(D)$ are positivity preserving, that
is, they map nonnegative functions to nonnegative functions. From the physics point
of view it is natural to expect that heat semigroups should have this property, as they
are meant to describe the time evolution of heat distributions. Positivity of the heat
semigroup on $L^2(\mathbb{R}^d)$ is evident from the explicit representation through convolution
with the heat kernel.


Theorem 13.50 (Positivity). Let $D$ be bounded. Then the $C_0$-semigroups on $L^2(D)$
generated by $\Delta_{\mathrm{Dir}}$ and $\Delta_{\mathrm{Neum}}$ are positivity preserving.

Proof Let $A$ denote the Dirichlet or Neumann Laplacian on $L^2(D)$ and let $S$ be the
$C_0$-semigroup generated by $A$ on $L^2(D)$. We must prove that $S(t)f \ge 0$ for all $t \ge 0$
whenever $f \in L^2(D)$ satisfies $f \ge 0$. In what follows we fix such a function $f$.

Step 1 – We first prove that for all $\lambda > 0$ we have $g := R(\lambda, A)f \ge 0$. Since the
positive and negative parts $g^+$ and $g^-$ of $g$ have disjoint supports, they are orthogonal
in $L^2(D)$ and therefore $(g^\pm|g) = \pm\|g^\pm\|^2$. Furthermore, Theorem 11.23 implies that if
$g \in H^1(D)$, then $g^\pm \in H^1(D)$ and $\partial_j g^\pm = \pm 1_{\{\pm g > 0\}}\partial_j g$, and this in turn implies that
$\int_D \nabla g \cdot \nabla g^\pm\,dx = \pm\int_D |\nabla g^\pm|^2\,dx$. Combination of these facts gives

\[ \begin{aligned} 0 \le \lambda\|g^-\|^2 = \lambda(g^-|g^-) = -\lambda(g|g^-) &= -(f|g^-) - (Ag|g^-) \\ &\le -(Ag|g^-) = \int_D \nabla g \cdot \nabla g^-\,dx = -\int_D |\nabla g^-|^2\,dx \le 0, \end{aligned} \]

the middle inequality being a consequence of the fact that $f \ge 0$, and the equality
following it being a consequence of the definition of $A$ as the operator associated with
the form on the right-hand side of the equality. This proves that $g^- = 0$ in $L^2(D)$, so
$R(\lambda, A)f = g = g^+ \ge 0$.

Step 2 – The positivity of the operators $S(t)$ follows from the result of Step 1 via the
Euler formula (Theorem 13.19)

\[ S(t)f = \lim_{n \to \infty}\Bigl(\frac{n}{t}R\Bigl(\frac{n}{t}, A\Bigr)\Bigr)^n f, \quad f \in L^2(D). \]
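The mechanism of Step 2 can be seen in a toy model. The sketch below replaces the Laplacian by the $2 \times 2$ matrix $A = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}$, a two-point analogue of a Neumann Laplacian chosen by us for illustration (it is not an operator from the text). Its resolvent $R(\lambda, A)$, $\lambda > 0$, has nonnegative entries, so every Euler approximant has nonnegative entries, and these converge to $e^{tA}$.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(X: Matrix, Y: Matrix) -> Matrix:
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def resolvent(lam: float) -> Matrix:
    """(lam - A)^{-1} for A = [[-1, 1], [1, -1]] and lam > 0, computed by hand."""
    det = lam * (lam + 2.0)
    return [[(lam + 1.0) / det, 1.0 / det], [1.0 / det, (lam + 1.0) / det]]


def euler_approximant(t: float, n: int) -> Matrix:
    """((n/t) R(n/t, A))^n, the n-th approximant in the Euler formula."""
    lam = n / t
    R = resolvent(lam)
    step = [[lam * R[i][j] for j in range(2)] for i in range(2)]
    result: Matrix = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul(result, step)
    return result


def exact_semigroup(t: float) -> Matrix:
    """exp(tA), obtained from the eigenvalues 0 and -2 of A."""
    e = math.exp(-2.0 * t)
    return [[(1 + e) / 2, (1 - e) / 2], [(1 - e) / 2, (1 + e) / 2]]


if __name__ == "__main__":
    print(euler_approximant(1.0, 2000))
    print(exact_semigroup(1.0))
```

In this model the row sums of each approximant equal $1$, a discrete counterpart of the identity $S_{\mathrm{Neum}}(t)1 = 1$.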


In Section 13.6.c we have seen that the Laplace operator $\Delta$ generates a $C_0$-semigroup
of contractions on $L^p(\mathbb{R}^d)$ for all $1 \le p < \infty$. For bounded open subsets $D$ of $\mathbb{R}^d$ up to
this point we have only considered the analogues of this semigroup on the space $L^2(D)$.
We prove next that the Dirichlet and Neumann Laplacians also generate $C_0$-semigroups
of contractions on the space $L^p(D)$ for $1 \le p < \infty$. This will be derived from an abstract
result on $L^p$-boundedness of submarkovian operators which we discuss first.
Let $(\Omega, \mathscr{F}, \mu)$ be a finite measure space. A bounded operator $T$ on $L^2(\Omega)$ is called
doubly submarkovian if it has the following properties:

(i) $Tf \ge 0$ for all $f \ge 0$;
(ii) $T1 \le 1$ and $T^*1 \le 1$.

Such operators enjoy the following extension property.


Theorem 13.51 ($L^p$-Boundedness of doubly submarkovian operators). Let $(\Omega, \mathscr{F}, \mu)$
be a finite measure space and let $T$ be a doubly submarkovian operator on $L^2(\Omega)$.
Then:

(1) for all $1 \le p < \infty$, the restriction of $T$ to $L^2(\Omega) \cap L^p(\Omega)$ has a unique extension to
a contraction on $L^p(\Omega)$;

(2) the restriction of $T$ to $L^2(\Omega) \cap L^\infty(\Omega)$ has a unique weak*-continuous extension to
a contraction on $L^\infty(\Omega)$.

Proof For all $f \in L^1(\Omega) \cap L^2(\Omega)$,

\[ \|Tf\|_1 = \||Tf|\|_1 \le \|T|f|\|_1 = (T|f|\,|\,1) = (|f|\,|\,T^*1) \le (|f|\,|\,1) = \||f|\|_1 = \|f\|_1. \]

Hence $T$ has a unique extension to a contraction on $L^1(\Omega)$. The same argument applied
to $T^*$ gives a unique extension of $T^*|_{L^1(\Omega) \cap L^2(\Omega)}$ to a contraction on $L^1(\Omega)$, and it is easy
to verify that the Hilbert space adjoint of this extension agrees with $T$ on the weak*-dense
subspace $L^2(\Omega) \cap L^\infty(\Omega)$ of $L^\infty(\Omega)$. This implies that $T|_{L^2(\Omega) \cap L^\infty(\Omega)}$ has a unique
weak*-continuous extension to a contraction on $L^\infty(\Omega)$. Boundedness and contractivity
for $1 < p < \infty$ now follow from the Riesz–Thorin interpolation theorem.
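On a finite set with counting measure, a doubly submarkovian operator is a matrix with nonnegative entries whose row sums ($T1 \le 1$) and column sums ($T^*1 \le 1$) are at most one. The sketch below uses a hypothetical $3 \times 3$ example of this kind and samples random vectors to illustrate the contractivity $\|Tf\|_p \le \|f\|_p$ asserted by the theorem; it is an empirical check, not a proof.

```python
import random
from typing import List

Matrix = List[List[float]]
Vector = List[float]

# Hypothetical example: nonnegative entries, all row sums and column sums at most 1.
T: Matrix = [
    [0.5, 0.3, 0.0],
    [0.2, 0.4, 0.3],
    [0.1, 0.2, 0.6],
]


def apply(T: Matrix, f: Vector) -> Vector:
    """Matrix-vector product Tf."""
    return [sum(T[i][j] * f[j] for j in range(len(f))) for i in range(len(T))]


def lp_norm(f: Vector, p: float) -> float:
    """The l^p norm with respect to counting measure."""
    return sum(abs(v) ** p for v in f) ** (1.0 / p)


def is_doubly_submarkovian(T: Matrix, tol: float = 1e-12) -> bool:
    """Check positivity, T1 <= 1 (row sums) and T*1 <= 1 (column sums)."""
    n = len(T)
    nonneg = all(T[i][j] >= 0.0 for i in range(n) for j in range(n))
    rows = all(sum(T[i]) <= 1.0 + tol for i in range(n))
    cols = all(sum(T[i][j] for i in range(n)) <= 1.0 + tol for j in range(n))
    return nonneg and rows and cols


def worst_ratio(T: Matrix, p: float, trials: int = 200, seed: int = 0) -> float:
    """Largest observed value of ||Tf||_p / ||f||_p over random samples f."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        f = [rng.uniform(-1.0, 1.0) for _ in range(len(T))]
        worst = max(worst, lp_norm(apply(T, f), p) / lp_norm(f, p))
    return worst


if __name__ == "__main__":
    for p in (1.0, 1.5, 2.0, 4.0, 10.0):
        print(p, worst_ratio(T, p))
```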


Turning to the $L^p$-boundedness of the heat semigroup, we begin with the case of
Neumann boundary conditions.

Theorem 13.52 ($L^p$-Boundedness, Neumann boundary conditions). Let $D$ be bounded
and let $S_{\mathrm{Neum}}$ denote the $C_0$-semigroup generated by $\Delta_{\mathrm{Neum}}$ on $L^2(D)$. For all $1 \le p < \infty$,
the restriction of $S_{\mathrm{Neum}}(t)$ to $L^p(D) \cap L^2(D)$ uniquely extends to a $C_0$-semigroup of
positivity preserving contractions on $L^p(D)$.

Proof By Proposition 13.49 and Theorem 13.50, for all $t \ge 0$ the operator $S_{\mathrm{Neum}}(t)$
is selfadjoint and positivity preserving. From $\Delta_{\mathrm{Neum}}1 = 0$ it follows that $S_{\mathrm{Neum}}^*(t)1 =
S_{\mathrm{Neum}}(t)1 = 1$ and therefore the operators $S_{\mathrm{Neum}}(t)$ are doubly submarkovian. Applying
Theorem 13.51 we obtain that for all $1 \le p < \infty$ and $t \ge 0$ the restriction of $S_{\mathrm{Neum}}(t)$
to $L^p(D) \cap L^2(D)$ has a unique extension, also denoted by $S_{\mathrm{Neum}}(t)$, to a contraction on
$L^p(D)$.
By Hölder’s inequality, the strong continuity of $S_{\mathrm{Neum}}$ on $L^2(D)$ implies that for all
$1 \le p < 2$ and $f \in L^2(D)$ we have

\[ \|S_{\mathrm{Neum}}(t)f - f\|_p \le |D|^{1/r}\|S_{\mathrm{Neum}}(t)f - f\|_2 \to 0 \]

as $t \downarrow 0$, where $\frac{1}{2} + \frac{1}{r} = \frac{1}{p}$. Since $\|S_{\mathrm{Neum}}(t)\|_p \le 1$ for all $t \ge 0$, the density of $L^2(D)$
in $L^p(D)$ implies that the strong continuity extends to all $f \in L^p(D)$. For $2 < p < \infty$ we
use selfadjointness to see that for all $f \in L^2(D) \cap L^p(D)$ and $g \in L^2(D) \cap L^q(D)$ with
$\frac{1}{p} + \frac{1}{q} = 1$ we have

\[ \begin{aligned} |\langle S_{\mathrm{Neum}}(t)f - f, g\rangle| &= |(S_{\mathrm{Neum}}(t)f - f|\overline{g})| = |(f|S_{\mathrm{Neum}}(t)\overline{g} - \overline{g})| \\ &= |\langle f, S_{\mathrm{Neum}}(t)g - g\rangle| \le \|f\|_p\|S_{\mathrm{Neum}}(t)g - g\|_q \to 0, \end{aligned} \tag{13.24} \]

applying in the last step what we already proved to the exponent $1 < q < 2$. Using the
contractivity of the operators $S_{\mathrm{Neum}}(t)$ on $L^p(D)$ and $L^q(D)$, by density (13.24) extends
to arbitrary $f \in L^p(D)$ and $g \in L^q(D)$. It follows that the semigroup $S_{\mathrm{Neum}}$ is weakly
continuous on $L^p(D)$. By Theorem 13.11, this implies its strong continuity. Positivity
on $L^p(D)$ follows from the positivity on $L^2(D)$ by a density argument.


Our next aim is to prove the following analogue of Theorem 13.52 for the Dirichlet 
heat semigroup. 


Theorem 13.53 ($L^p$-Boundedness, Dirichlet boundary conditions). Let $D$ be a bounded
open subset of $\mathbb{R}^d$ and let $S_{\mathrm{Dir}}$ denote the $C_0$-semigroup generated by $\Delta_{\mathrm{Dir}}$ on $L^2(D)$. For
all $1 \le p < \infty$, the restriction of $S_{\mathrm{Dir}}$ to $L^p(D) \cap L^2(D)$ extends to a $C_0$-semigroup of
positivity preserving contractions on $L^p(D)$.


The heart of the matter is to prove the following resolvent inequality. 
Lemma 13.54. For all $0 \le f \in L^2(D)$ and $\lambda > 0$ we have

\[ 0 \le R(\lambda, \Delta_{\mathrm{Dir}})f \le R(\lambda, \Delta_{\mathrm{Neum}})f. \]

Proof Fix $\lambda > 0$ and $0 \le f \in L^2(D)$. Then $0 \le u := R(\lambda, \Delta_{\mathrm{Dir}})f \in D(\Delta_{\mathrm{Dir}})$ and $0 \le
v := R(\lambda, \Delta_{\mathrm{Neum}})f \in D(\Delta_{\mathrm{Neum}})$ and

\[ \lambda u - \Delta_{\mathrm{Dir}}u = f = \lambda v - \Delta_{\mathrm{Neum}}v. \tag{13.25} \]

The lemma will be proved by showing that this implies $u \le v$.
Fix a nonnegative test function $\phi \in C_c^\infty(D)$. Multiplying (13.25) on both sides with $\phi$
and integrating by parts, we arrive at

\[ \lambda\int_D u\phi\,dx + \int_D \nabla u \cdot \nabla\phi\,dx = \lambda\int_D v\phi\,dx + \int_D \nabla v \cdot \nabla\phi\,dx. \tag{13.26} \]

We claim that this equality extends to all nonnegative functions $\phi \in H_0^1(D)$. Indeed, if
$\phi_n \to \phi$ in $H_0^1(D)$ with $\phi_n \in C_c^\infty(D)$ for all $n \ge 1$, then $\phi_n^+ \to \phi$ in $H_0^1(D)$ by Theorem
11.23. Since each $\phi_n^+$ has compact support, mollification with a nonnegative compactly
supported smooth mollifier allows us to approximate $\phi_n^+$ by nonnegative test functions
in $H_0^1(D)$ as in Proposition 11.22. Applying (13.26) to these test functions and taking
limits, the claim is obtained.

Next we claim that $(u - v)^+$ belongs to $H_0^1(D)$, the point here being that $u \in H_0^1(D)$
but $v \in H^1(D)$. To prove the claim let $u_k \to u$ in $H_0^1(D)$ with $u_k \in C_c^\infty(D)$. Since $v \ge 0$,
for each $k \ge 1$ the function $(u_k - v)^+$ is supported in the compact support of $u_k$ and
therefore it belongs to $H_0^1(D)$ by Proposition 11.22. By another application of Theorem
11.23 it then follows that $(u - v)^+ = \lim_{k \to \infty}(u_k - v)^+$ belongs to $H_0^1(D)$ as claimed.

By (13.26), which we may now apply to $\phi = (u - v)^+$,

\[ \lambda\int_D u(u - v)^+\,dx + \int_D \nabla u \cdot \nabla(u - v)^+\,dx = \lambda\int_D v(u - v)^+\,dx + \int_D \nabla v \cdot \nabla(u - v)^+\,dx. \]

As a consequence,

\[ \begin{aligned} \lambda\int_D ((u - v)^+)^2\,dx &= \lambda\int_D (u - v)(u - v)^+\,dx \\ &= -\int_D \nabla(u - v) \cdot \nabla(u - v)^+\,dx = -\int_D |\nabla(u - v)^+|^2\,dx \le 0, \end{aligned} \]

arguing as in the proof of Theorem 13.50 in the last step. This implies that $(u - v)^+ \le 0$,
that is, $u \le v$.


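Proof of Theorem 13.53 By the lemma and Euler’s formula (Theorem 13.19),

\[ S_{\mathrm{Dir}}(t)f = \lim_{n \to \infty}\Bigl(\frac{n}{t}R\Bigl(\frac{n}{t}, \Delta_{\mathrm{Dir}}\Bigr)\Bigr)^n f \le \lim_{n \to \infty}\Bigl(\frac{n}{t}R\Bigl(\frac{n}{t}, \Delta_{\mathrm{Neum}}\Bigr)\Bigr)^n f = S_{\mathrm{Neum}}(t)f. \]

In particular this implies $S_{\mathrm{Dir}}(t)1 \le S_{\mathrm{Neum}}(t)1 = 1$. Together with positivity and
selfadjointness, this implies that the operators $S_{\mathrm{Dir}}(t)$ are doubly submarkovian. The proof
can now be finished along the lines of that of Theorem 13.52.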


13.6.d The Poisson Semigroup 


Let $1 \le p < \infty$. For $t > 0$ we define the operator $P(t)$ on $L^p(\mathbb{R}^d)$ by convolution with
the Poisson kernel

\[ p_t(x) := c_d\frac{t}{(t^2 + |x|^2)^{\frac{1}{2}(d+1)}}, \quad t > 0,\ x \in \mathbb{R}^d, \tag{13.27} \]

where $c_d = \Gamma(\frac{1}{2}(d+1))/\pi^{\frac{1}{2}(d+1)}$ with $\Gamma(t) = \int_0^\infty x^{t-1}e^{-x}\,dx$ the Euler Gamma function.
The change of variables $y = x/t$ gives the norm estimate

\[ \|p_t\|_1 = c_d\int_{\mathbb{R}^d}\frac{1}{(1 + |y|^2)^{\frac{1}{2}(d+1)}}\,dy = c_d\sigma_{d-1}\int_0^\infty\frac{r^{d-1}}{(1 + r^2)^{\frac{1}{2}(d+1)}}\,dr = c_d\sigma_{d-1}\cdot\frac{\sqrt{\pi}}{2}\frac{\Gamma(\frac{1}{2}d)}{\Gamma(\frac{1}{2}(d+1))} = 1, \]

where $\sigma_{d-1} = 2\pi^{d/2}/\Gamma(\frac{1}{2}d)$ is the surface area of the unit sphere in $\mathbb{R}^d$. Young’s
inequality guarantees that the operators $P(t)$ are well defined and contractive on $L^p(\mathbb{R}^d)$.
For $d = 1$, the formula (13.27) takes the simpler form

\[ p_t(x) = \frac{1}{\pi}\frac{t}{t^2 + x^2}, \quad t > 0,\ x \in \mathbb{R}. \]

Define

\[ P(t)f := p_t * f \]

for $t > 0$ and $f \in L^p(\mathbb{R}^d)$ and supplement this definition by setting $P(0) := I$. To see
that the operators $P(t)$ satisfy the semigroup property we will show that the Fourier
transform of $p_t$ is given by

\[ \widehat{p_t}(\xi) = \frac{1}{(2\pi)^{d/2}}\exp(-t|\xi|). \tag{13.28} \]


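To prove this identity we compute the inverse Fourier transform of the exponential on
the right-hand side. By a standard contour integration argument, for $\gamma > 0$ we have

\[ e^{-\gamma} = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{e^{i\gamma y}}{1 + y^2}\,dy. \]

Writing $\frac{1}{1 + y^2} = \int_0^\infty e^{-(1 + y^2)u}\,du$, using Fubini’s theorem to interchange the order of
integration, and Lemma 5.20 and a change of variables to evaluate the inner integral, we
obtain

\[ \begin{aligned} e^{-\gamma} &= \frac{1}{\pi}\int_{-\infty}^{\infty}\int_0^\infty e^{i\gamma y}e^{-(1 + y^2)u}\,du\,dy \\ &= \frac{1}{\pi}\int_0^\infty e^{-u}\Bigl(\int_{-\infty}^{\infty} e^{i\gamma y}e^{-uy^2}\,dy\Bigr)du = \frac{1}{\sqrt{\pi}}\int_0^\infty\frac{e^{-u}}{\sqrt{u}}e^{-\gamma^2/4u}\,du. \end{aligned} \]

We apply this with $\gamma = t|\xi|$. Using Fubini’s theorem, Lemma 5.20, and another
substitution, we obtain

\[ \begin{aligned} \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d}\frac{1}{(2\pi)^{d/2}}\exp(-t|\xi|)\exp(ix \cdot \xi)\,d\xi &= \frac{1}{(2\pi)^d}\frac{1}{\sqrt{\pi}}\int_{\mathbb{R}^d}\int_0^\infty\frac{e^{-u}}{\sqrt{u}}e^{-t^2|\xi|^2/4u}\exp(ix \cdot \xi)\,du\,d\xi \\ &= \frac{1}{(2\pi)^d}\frac{1}{\sqrt{\pi}}\int_0^\infty\frac{e^{-u}}{\sqrt{u}}\int_{\mathbb{R}^d} e^{-t^2|\xi|^2/4u}\exp(ix \cdot \xi)\,d\xi\,du \\ &= \frac{1}{(2\pi)^d}\frac{1}{\sqrt{\pi}}\int_0^\infty\frac{e^{-u}}{\sqrt{u}}\Bigl(\frac{4\pi u}{t^2}\Bigr)^{d/2}e^{-u|x|^2/t^2}\,du \\ &= \frac{1}{\pi^{\frac{1}{2}(d+1)}}\frac{1}{t^d}\int_0^\infty e^{-(1 + |x|^2/t^2)u}u^{\frac{1}{2}(d-1)}\,du \\ &= \frac{1}{\pi^{\frac{1}{2}(d+1)}}\frac{t}{(t^2 + |x|^2)^{\frac{1}{2}(d+1)}}\int_0^\infty e^{-v}v^{\frac{1}{2}(d+1) - 1}\,dv \\ &= \frac{1}{\pi^{\frac{1}{2}(d+1)}}\frac{t}{(t^2 + |x|^2)^{\frac{1}{2}(d+1)}}\Gamma(\tfrac{1}{2}(d+1)) = p_t(x). \end{aligned} \]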


This completes the proof of (13.28). Thanks to this identity, for $t, s > 0$ we obtain

\[ \begin{aligned} \widehat{p_t * p_s} = (2\pi)^{d/2}\,\widehat{p_t}\,\widehat{p_s} &= (2\pi)^{-d/2}\exp(-t|\xi|)\exp(-s|\xi|) \\ &= (2\pi)^{-d/2}\exp(-(t + s)|\xi|) = \widehat{p_{t+s}} \end{aligned} \]

and therefore $p_t * p_s = p_{t+s}$. It follows that for, say, $f \in C_c(\mathbb{R}^d)$,

\[ P(t)P(s)f = p_t * (p_s * f) = (p_t * p_s) * f = p_{t+s} * f = P(t + s)f. \]

Since $C_c(\mathbb{R}^d)$ is dense in $L^p(\mathbb{R}^d)$ for $1 \le p < \infty$ this proves that $P(t)P(s) = P(t + s)$ for
all $t, s > 0$. This identity of course trivially extends to $t, s \ge 0$. Strong continuity of the
family $P$ on $L^p(\mathbb{R}^d)$ for $1 \le p < \infty$ is an immediate consequence of Proposition 2.34.
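The subordination identity $e^{-\gamma} = \frac{1}{\sqrt{\pi}}\int_0^\infty \frac{e^{-u}}{\sqrt{u}}e^{-\gamma^2/4u}\,du$, which drives the computation of the Fourier transform of the Poisson kernel, lends itself to a quick numerical check. The sketch below substitutes $u = s^2$ to remove the singularity at $u = 0$ and applies a midpoint rule on a truncated interval; the truncation point and the number of nodes are our choices.

```python
import math


def subordination_integral(gamma: float, upper: float = 10.0, n: int = 20000) -> float:
    """Midpoint approximation of (2/sqrt(pi)) * int_0^upper exp(-s^2 - gamma^2/(4 s^2)) ds.

    This equals (1/sqrt(pi)) * int_0^infty u^{-1/2} e^{-u} e^{-gamma^2/(4u)} du after the
    substitution u = s^2, up to a negligible truncation error.
    """
    h = upper / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += math.exp(-s * s - gamma * gamma / (4.0 * s * s))
    return 2.0 * total * h / math.sqrt(math.pi)


if __name__ == "__main__":
    for gamma in (0.5, 1.0, 3.0):
        print(gamma, subordination_integral(gamma), math.exp(-gamma))
```

The identity expresses the Poisson kernel as an average of heat kernels, which is why the Gaussian integral of Lemma 5.20 can be reused.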
We will determine the generator $A$ of this semigroup for $p = 2$. It will turn out that

\[ A = -(-\Delta)^{1/2}, \quad D(A) = H^1(\mathbb{R}^d). \]

Here, the square root is defined either by the functional calculus of the selfadjoint
operator $-\Delta$ (see Proposition 10.58) or by the following more direct argument. Recall the
definition of $H^1(\mathbb{R}^d)$ as the space of all $f \in L^2(\mathbb{R}^d)$ for which

\[ \xi \mapsto (1 + |\xi|^2)^{1/2}\widehat{f}(\xi) \]

belongs to $L^2(\mathbb{R}^d)$. In view of the trivial inequality $|\xi| \le (1 + |\xi|^2)^{1/2}$, for all $f \in
H^1(\mathbb{R}^d)$ the function $\xi \mapsto |\xi|\widehat{f}(\xi)$ belongs to $L^2(\mathbb{R}^d)$. Thus we can define an operator
$(B, D(B))$ by

\[ Bf := \mathscr{F}^{-1}(\xi \mapsto |\xi|\widehat{f}(\xi)), \quad D(B) = H^1(\mathbb{R}^d). \]

Note the formal analogy with the definition of Fourier multiplier operators; the only
difference here is that the multiplier function $m(\xi) = |\xi|$ does not belong to $L^\infty(\mathbb{R}^d)$
and is therefore not covered by the definition of these operators. For $f \in H^2(\mathbb{R}^d)$ we
similarly have

\[ -\Delta f = \mathscr{F}^{-1}(\xi \mapsto |\xi|^2\widehat{f}(\xi)) \]

so that $Bf \in D(B)$ and $B^2f := B(Bf) = -\Delta f$. This justifies the notation

\[ (-\Delta)^{1/2} := B, \quad D((-\Delta)^{1/2}) = H^1(\mathbb{R}^d). \tag{13.29} \]
From

\[ (Bf|f) = (\widehat{Bf}|\widehat{f}) = \int_{\mathbb{R}^d}|\xi||\widehat{f}(\xi)|^2\,d\xi \ge 0 \]

we see that $B$ is positive. The uniqueness part of Proposition 10.58 implies that $B$
coincides with the positive square root of $-\Delta$ obtained in the corollary by means of the
functional calculus of $-\Delta$.

In dimension $d = 1$ the identity $|\xi| = -i\,\mathrm{sign}(\xi) \cdot i\xi$ implies that

\[ (-d^2/dx^2)^{1/2} = H \circ \frac{d}{dx}, \]

where $H$ is the Hilbert transform (which, as we recall from Section 5.6, is the Fourier
multiplier operator corresponding to the multiplier $\xi \mapsto -i\,\mathrm{sign}(\xi)$).


Theorem 13.55 (Poisson semigroup). The Poisson semigroup on $L^2(\mathbb{R}^d)$ is generated
by the selfadjoint operator $-(-\Delta)^{1/2}$.

Selfadjointness follows from Example 10.37.


Proof We start by noting that a function $f \in L^2(\mathbb{R}^d)$ belongs to $H^1(\mathbb{R}^d)$ if and only if
$\xi \mapsto |\xi|\widehat{f}(\xi)$ belongs to $L^2(\mathbb{R}^d)$. The ‘only if’ part has already been noted, and the ‘if’
part follows from the inequality $(1 + |\xi|^2)^{1/2} \le 1 + |\xi|$ in the same way.

On $L^2(\mathbb{R}^d)$ we now define the multiplication semigroup $Q$ by

\[ Q(t)g(\xi) := e^{-t|\xi|}g(\xi) \]

for $t \ge 0$ and $g \in L^2(\mathbb{R}^d)$. As we have shown in Section 13.6.a, this is a $C_0$-semigroup
whose generator $(C, D(C))$ is given by

\[ Cg(\xi) = -|\xi|g(\xi) \]

for $g \in D(C) = \{g \in L^2(\mathbb{R}^d) : \xi \mapsto |\xi|g(\xi) \in L^2(\mathbb{R}^d)\}$. Evidently,

\[ P(t) = \mathscr{F}^{-1} \circ Q(t) \circ \mathscr{F}, \quad t \ge 0, \]

from which it follows that a function $f \in L^2(\mathbb{R}^d)$ belongs to the domain of the generator
$A$ of $P$ if and only if $\mathscr{F}f = \widehat{f}$ belongs to the domain of the generator $C$ of $Q$, in which
case the identity

\[ Af = \mathscr{F}^{-1} \circ C \circ \mathscr{F}f \]

holds. By the observation at the beginning of the proof and (13.29), $\mathscr{F}f$ belongs to
$D(C)$ if and only if $f \in H^1(\mathbb{R}^d) = D((-\Delta)^{1/2})$, and in that case we have

\[ -(-\Delta)^{1/2}f = \mathscr{F}^{-1}(\xi \mapsto -|\xi|\widehat{f}(\xi)) = \mathscr{F}^{-1} \circ C \circ \mathscr{F}f. \]

These considerations prove that $A = -(-\Delta)^{1/2}$ with equality of their domains.
The operator $(-\Delta)^{1/2}$ has an interesting connection with the wave equation which
will be elaborated in Section 13.6.h.


13.6.e The Ornstein—Uhlenbeck Semigroup 


In this section we assume some elementary knowledge about Gaussian random
variables. Let

\[ d\gamma(x) := \frac{1}{(2\pi)^{d/2}}\exp(-\tfrac{1}{2}|x|^2)\,dx \]

denote the standard Gaussian measure on $\mathbb{R}^d$. For $t \ge 0$ and $f \in C_c(\mathbb{R}^d)$, define the
operator $OU(t)$ on $L^p(\mathbb{R}^d, \gamma)$ by

\[ OU(t)f(x) := \int_{\mathbb{R}^d} f(e^{-t}x + \sqrt{1 - e^{-2t}}\,y)\,d\gamma(y), \quad x \in \mathbb{R}^d. \]

We will prove that $OU$ is a $C_0$-semigroup on $L^p(\mathbb{R}^d, \gamma)$ for all $1 \le p < \infty$. This semigroup
is known as the Ornstein–Uhlenbeck semigroup. It plays a central role in the so-called
Malliavin calculus, an infinite-dimensional Gaussian version of calculus which finds
applications in, for example, the theory of stochastic (partial) differential equations and
mathematical finance. Interestingly, this semigroup also makes its appearance in Quantum
Field Theory, where its negative generator takes the role of the so-called bosonic
number operator. This point of view will be taken up in Section 15.6.

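Let us first show that each operator $OU(t)$ extends to a bounded operator on $L^p(\mathbb{R}^d, \gamma)$
of norm at most 1. By Hölder’s inequality, for all $f \in C_c(\mathbb{R}^d)$ we have

\[ \begin{aligned} \|OU(t)f\|_{L^p(\mathbb{R}^d,\gamma)}^p &= \int_{\mathbb{R}^d}\Bigl|\int_{\mathbb{R}^d} f(e^{-t}x + \sqrt{1 - e^{-2t}}\,y)\,d\gamma(y)\Bigr|^p d\gamma(x) \\ &\le \int_{\mathbb{R}^d}\int_{\mathbb{R}^d}|f(e^{-t}x + \sqrt{1 - e^{-2t}}\,y)|^p\,d\gamma(y)\,d\gamma(x) \\ &= \mathbb{E}|f(e^{-t}X + \sqrt{1 - e^{-2t}}\,Y)|^p, \end{aligned} \]

where $X$ and $Y$ are independent standard Gaussian random variables defined on some
probability space and $\mathbb{E}$ is the expectation with respect to the probability measure. Since
$(e^{-t})^2 + (\sqrt{1 - e^{-2t}})^2 = 1$, the random variable $e^{-t}X + \sqrt{1 - e^{-2t}}\,Y$ is standard Gaussian
again, so it is equal in distribution to $X$. Hence

\[ \mathbb{E}|f(e^{-t}X + \sqrt{1 - e^{-2t}}\,Y)|^p = \mathbb{E}|f(X)|^p = \int_{\mathbb{R}^d}|f(x)|^p\,d\gamma(x) = \|f\|_{L^p(\mathbb{R}^d,\gamma)}^p. \]

This proves that $\|OU(t)f\|_p \le \|f\|_p$.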

Next we claim that $C_c(\mathbb{R}^d)$ is dense in $L^p(\mathbb{R}^d, \gamma)$. To this end we first approximate
$f$ by a function of the form $\psi f$, where $0 \le \psi \le 1$ and $\psi \equiv 1$ on a large enough open
ball $B(0;r)$ in $\mathbb{R}^d$. On each ball $B(0;r)$ the Gaussian density is bounded from below,
and therefore convergence in the $L^p(B(0;r), \gamma)$-norm is equivalent to convergence in the
$L^p(B(0;r))$-norm. Since $C_c(B(0;r))$ is dense in $L^p(B(0;r))$ the desired result follows.
Combining the above two steps it follows that the operators $OU(t)$ extend uniquely to
contractions on $L^p(\mathbb{R}^d, \gamma)$. By a limiting argument involving the extraction of an almost
everywhere convergent subsequence and dominated convergence, the defining formula
for $OU(t)$ extends to arbitrary functions $f \in L^p(\mathbb{R}^d, \gamma)$, in the sense that for every $f \in
L^p(\mathbb{R}^d, \gamma)$ the formula holds for almost all $x \in \mathbb{R}^d$.
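Next we prove that $OU$ is a $C_0$-semigroup on $L^p(\mathbb{R}^d,\gamma)$. It is clear that (S1) holds. To
prove the semigroup property (S2) let us first fix a function $f \in C_c(\mathbb{R}^d)$. Then $OU(t)f \in
C_b(\mathbb{R}^d)$ and

\begin{align*}
OU(t)OU(s)f(x)
 &= \int_{\mathbb{R}^d} OU(s)f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr)\,d\gamma(y) \\
 &= \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} f\bigl(e^{-s}(e^{-t}x + \sqrt{1-e^{-2t}}\,y) + \sqrt{1-e^{-2s}}\,z\bigr)\,d\gamma(z)\,d\gamma(y) \\
 &= \mathbb{E} f\bigl(e^{-(t+s)}x + e^{-s}\sqrt{1-e^{-2t}}\,Y + \sqrt{1-e^{-2s}}\,Z\bigr),
\end{align*}

where $Y$ and $Z$ are independent standard Gaussians. In view of the identity

\[ \bigl(e^{-s}\sqrt{1-e^{-2t}}\bigr)^2 + \bigl(\sqrt{1-e^{-2s}}\bigr)^2 = 1 - e^{-2(t+s)}, \]

the random variable $e^{-s}\sqrt{1-e^{-2t}}\,Y + \sqrt{1-e^{-2s}}\,Z$ is equal in distribution to a Gaussian
random variable with variance $1 - e^{-2(t+s)}$. Therefore

\begin{align*}
\mathbb{E} f\bigl(e^{-(t+s)}x + e^{-s}\sqrt{1-e^{-2t}}\,Y + \sqrt{1-e^{-2s}}\,Z\bigr)
 &= \mathbb{E} f\bigl(e^{-(t+s)}x + \sqrt{1-e^{-2(t+s)}}\,Y\bigr) \\
 &= \int_{\mathbb{R}^d} f\bigl(e^{-(t+s)}x + \sqrt{1-e^{-2(t+s)}}\,y\bigr)\,d\gamma(y) = OU(t+s)f(x).
\end{align*}

This proves the identity $OU(t)OU(s)f = OU(t+s)f$ for $f \in C_c(\mathbb{R}^d)$. Using the denseness
of these functions in $L^p(\mathbb{R}^d,\gamma)$, the identity extends to general $f \in L^p(\mathbb{R}^d,\gamma)$.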
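As a sanity check of the semigroup property, the next Python sketch (an illustration under stated assumptions, not part of the proof) compares $OU(t)OU(s)f$ with $OU(t+s)f$ for $d = 1$ and $f = \cos$, again using Gauss–Hermite quadrature. The helper `ou` is a hypothetical name; $\cos$ is bounded and continuous rather than compactly supported, and it is chosen because $\mathbb{E}\cos(a + \sigma Y) = \cos(a)\,e^{-\sigma^2/2}$ gives a closed form to compare against.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

NODES, WEIGHTS = hermegauss(60)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)  # normalise to the Gaussian measure

def ou(t, f, x):
    """Quadrature approximation of OU(t)f(x) in dimension d = 1."""
    x = np.asarray(x, dtype=float)
    arg = np.exp(-t) * x[..., None] + np.sqrt(1.0 - np.exp(-2.0 * t)) * NODES
    return np.sum(f(arg) * WEIGHTS, axis=-1)

def ou_cos_exact(t, x):
    """Closed form of OU(t)cos, from E cos(a + sigma*Y) = cos(a) exp(-sigma^2 / 2)."""
    return np.cos(np.exp(-t) * x) * np.exp(-0.5 * (1.0 - np.exp(-2.0 * t)))

t, s = 0.3, 0.7
x = np.linspace(-3.0, 3.0, 13)

composed = ou(t, lambda y: ou(s, np.cos, y), x)  # OU(t)OU(s)f
direct = ou(t + s, np.cos, x)                    # OU(t+s)f

print("max |OU(t)OU(s)f - OU(t+s)f| =", np.max(np.abs(composed - direct)))
print("max |OU(t+s)f - closed form| =", np.max(np.abs(direct - ou_cos_exact(t + s, x))))
```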

To prove the strong continuity property (S3) we again first consider a function $f \in
C_c(\mathbb{R}^d)$. By Hölder's inequality,

\[ \|OU(t)f - f\|^p_{L^p(\mathbb{R}^d,\gamma)} \le \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} \bigl| f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr) - f(x) \bigr|^p \,d\gamma(y)\,d\gamma(x). \]

The right-hand side tends to 0 as $t \downarrow 0$ by dominated convergence. This gives strong
continuity for functions $f \in C_c(\mathbb{R}^d)$. The general case follows again by approximation,
keeping in mind that the operators $OU(t)$ are all contractive.

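The generator of $(OU(t))_{t\ge 0}$ is traditionally denoted as $L$. We show next that it is
given, for functions $f \in C_c^2(\mathbb{R}^d)$, by the Ornstein–Uhlenbeck operator

\[ Lf(x) = \Delta f(x) - x\cdot\nabla f(x). \tag{13.30} \]

Before turning to the proof, we wish to point out an interesting feature of this formula.
Multiplying both sides with a test function $\phi$, integrating with respect to $\gamma$, and integrating
by parts after having written out the Gaussian density, we obtain

\begin{align*}
\int_{\mathbb{R}^d} Lf(x)\phi(x)\,d\gamma(x)
 &= \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \bigl[\Delta f(x) - x\cdot\nabla f(x)\bigr]\phi(x)\exp\bigl(-\tfrac12|x|^2\bigr)\,dx \\
 &= -\frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \nabla f(x)\cdot\nabla\phi(x)\exp\bigl(-\tfrac12|x|^2\bigr)\,dx \tag{13.31}\\
 &= -\int_{\mathbb{R}^d} \nabla f(x)\cdot\nabla\phi(x)\,d\gamma(x).
\end{align*}

Since $C_c^\infty(\mathbb{R}^d)$ is dense in $D(\nabla) = W^{1,2}(\mathbb{R}^d,\gamma)$, the Gaussian Sobolev space of all $f \in
L^2(\mathbb{R}^d,\gamma)$ whose weak first order derivatives exist and belong to $L^2(\mathbb{R}^d,\gamma)$, for $p = 2$
the identity (13.31) identifies $-L$ as the operator associated with the closed, densely
defined, accretive form $\mathfrak{a}_{OU}$ with domain $W^{1,2}(\mathbb{R}^d,\gamma)$ defined by

\[ \mathfrak{a}_{OU}(f,g) = \int_{\mathbb{R}^d} \nabla f\cdot\overline{\nabla g}\,d\gamma. \]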

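Let us now turn to a proof of (13.30). Substituting $\sqrt{1-e^{-2t}}\,y = u$ and $e^{-t}x + u = v$
and writing out the Gaussian density, we arrive at

\begin{align*}
OU(t)f(x)
 &= \frac{1}{(2\pi)^{d/2}} \Bigl(\frac{1}{1-e^{-2t}}\Bigr)^{d/2} \int_{\mathbb{R}^d} f(e^{-t}x + u)\exp\Bigl(-\frac12\frac{|u|^2}{1-e^{-2t}}\Bigr)\,du \\
 &= \frac{1}{(2\pi)^{d/2}} \Bigl(\frac{1}{1-e^{-2t}}\Bigr)^{d/2} \int_{\mathbb{R}^d} f(v)\exp\Bigl(-\frac12\frac{|e^{-t}x - v|^2}{1-e^{-2t}}\Bigr)\,dv \tag{13.32}\\
 &= \int_{\mathbb{R}^d} M_t(x,v)f(v)\,dv.
\end{align*}

This represents $OU(t)$ as an integral operator with kernel

\[ M_t(x,y) = \frac{1}{(2\pi)^{d/2}} \Bigl(\frac{1}{1-e^{-2t}}\Bigr)^{d/2} \exp\Bigl(-\frac12\frac{|e^{-t}x - y|^2}{1-e^{-2t}}\Bigr). \]

The function $M_t$ is called the Mehler kernel at time $t$. We can express it in terms of
the heat kernel as $M_t(x,y) = K_{\frac12(1-e^{-2t})}(e^{-t}x - y)$, and therefore we have the pointwise
identity

\[ OU(t)f(x) = H\bigl(\tfrac12(1-e^{-2t})\bigr)f(e^{-t}x), \]

where $H$ is the heat semigroup on $L^2(\mathbb{R}^d)$.
Let $f \in C_c^2(\mathbb{R}^d)$. Then $f \in W^{2,2}(\mathbb{R}^d)$ and therefore, by the results of Section 13.6.c,
$f \in D(\Delta)$. Hence, we may differentiate the above identity at $t = 0$ and obtain

\begin{align*}
\frac{d}{dt}\Big|_{t=0} OU(t)f
 &= \Bigl[ e^{-2t}\Delta H\bigl(\tfrac12(1-e^{-2t})\bigr)f(e^{-t}\,\cdot\,) - e^{-t}(\cdot)\cdot H\bigl(\tfrac12(1-e^{-2t})\bigr)\nabla f(e^{-t}\,\cdot\,) \Bigr]_{t=0} \\
 &= \Delta f - (\cdot)\cdot\nabla f.
\end{align*}

In the middle expression, $H(s)\nabla g$ is short-hand for the vector with components $H(s)\partial_j g$,
$j = 1,\dots,d$. The combined use of the product rule and chain rule can be rigorously
justified by going through the steps of the standard proof of the corresponding scalar
analogue, which we leave as an exercise to the reader. It is important to point out that
the limit is taken with respect to the norm of $L^2(\mathbb{R}^d)$, as we were dealing with the heat
semigroup in $L^2(\mathbb{R}^d)$. However, since convergence in $L^2(\mathbb{R}^d)$ implies convergence in
$L^2(\mathbb{R}^d,\gamma)$ it follows that the above differentiation can retrospectively be interpreted
with respect to the norm of $L^2(\mathbb{R}^d,\gamma)$. This proves that $f \in D(L)$ and that the asserted
formula for $Lf$ holds.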
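The formula $Lf = \Delta f - x\cdot\nabla f$ can be illustrated numerically for $d = 1$ by comparing the difference quotient $(OU(h)f - f)/h$ with $f'' - xf'$ for small $h$. The Python sketch below is a hedged illustration: the helper `ou` and the choice $f(x) = e^{-x^2}$ are assumptions made for the example (this $f$ is rapidly decaying rather than compactly supported), and the integrals are approximated by Gauss–Hermite quadrature.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

NODES, WEIGHTS = hermegauss(60)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)  # normalise to the Gaussian measure

def ou(t, f, x):
    """Quadrature approximation of OU(t)f(x) in dimension d = 1."""
    x = np.asarray(x, dtype=float)
    arg = np.exp(-t) * x[..., None] + np.sqrt(1.0 - np.exp(-2.0 * t)) * NODES
    return np.sum(f(arg) * WEIGHTS, axis=-1)

def f(x):
    return np.exp(-x**2)  # hypothetical smooth test function

def Lf(x):
    # f'(x) = -2x e^{-x^2} and f''(x) = (4x^2 - 2) e^{-x^2},
    # hence f''(x) - x f'(x) = (6x^2 - 2) e^{-x^2}.
    return (6.0 * x**2 - 2.0) * np.exp(-x**2)

x = np.linspace(-2.0, 2.0, 9)
errors = []
for h in (1e-2, 1e-3, 1e-4):
    quotient = (ou(h, f, x) - f(x)) / h
    errors.append(np.max(np.abs(quotient - Lf(x))))
    print(f"h = {h:g}: max |(OU(h)f - f)/h - Lf| = {errors[-1]:.2e}")
```

The error is expected to shrink roughly linearly in $h$, in line with a first-order difference quotient.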

Let us show next that the analogue of Theorem 12.19 holds for $L$. We have already
identified $-L$ as the operator associated with the form $\mathfrak{a}_{OU}$. This, in combination with
Theorem 10.44 and Proposition 12.18, implies that

\[ -L = \nabla^\star\nabla, \]

where the adjoint refers to the inner product of $L^2(\mathbb{R}^d,\gamma)$. Finally, mutatis mutandis the
argument for the heat semigroup can be repeated to prove that in $L^2(\mathbb{R}^d,\gamma)$ the generator
domain $D(L)$ equals the domain of the weak $L^2$-Ornstein–Uhlenbeck operator which is
defined in the obvious way.


Remark 13.56. For $1 < p < \infty$ it can be shown that

\[ D(L) = W^{2,p}(\mathbb{R}^d,\gamma), \]

the Gaussian Sobolev space of all $f \in L^p(\mathbb{R}^d,\gamma)$ admitting weak derivatives up to order
2, all of which belong to $L^p(\mathbb{R}^d,\gamma)$. The proof of this fact is beyond the scope of this
work, even for the case $p = 2$.


For $d = 1$ one has the following representation for the Ornstein–Uhlenbeck semigroup
in terms of the Hermite polynomials $H_n$ discussed in Section 3.5.b:

\[ OU(t)H_n = e^{-nt}H_n, \qquad n \in \mathbb{N}. \tag{13.33} \]

This is a simple exercise based on the representation (13.30) and the recurrence relation
for the Hermite polynomials discussed in Section 3.5.b, and it implies

\[ \sigma(-L) = \mathbb{N} \]

(see Corollary 15.59). The formula (13.33) generalises to arbitrary dimension $d$ once
one has found an analogue of the Hermite basis for $L^2(\mathbb{R}^d,\gamma)$. This is accomplished in
Section 15.6.a (see Theorems 15.57 and 15.58). An immediate consequence is that the
Ornstein–Uhlenbeck semigroup extends holomorphically and contractively to the open
right-half plane $\{z \in \mathbb{C} : \operatorname{Re} z > 0\}$ and is strongly continuous on every sector $\Sigma_\omega$ with
$\omega < \tfrac12\pi$. Another way to see this is to note that $L$ is selfadjoint (by Proposition 10.41,
noting that $L$ is symmetric by the selfadjointness of the operators $OU(t)$ and satisfies
$(0,\infty) \subseteq \rho(L)$ by Proposition 13.8); the holomorphic extension to the right-half plane
may now be defined by

\[ OU(z) = \exp(zL), \qquad \operatorname{Re} z > 0. \]

Following this approach, strong continuity on the sectors $\Sigma_\omega$ with $\omega < \tfrac12\pi$ will follow
from Theorem 13.63 (applied to $N = -L$). This discussion is summarised in the following theorem.


Theorem 13.57. The operator $L$ generates an analytic $C_0$-semigroup of contractions
on every sector $\Sigma_\omega$ with $\omega < \tfrac12\pi$.
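The eigenvalue relation (13.33) lends itself to a quick numerical check. The Python sketch below assumes that the $H_n$ are the probabilists' Hermite polynomials (orthogonal with respect to the standard Gaussian measure), which NumPy exposes through `numpy.polynomial.hermite_e`; the helper names `ou` and `hermite` are hypothetical, and the check is an illustration rather than a proof.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

NODES, WEIGHTS = hermegauss(60)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)  # normalise to the Gaussian measure

def ou(t, f, x):
    """Quadrature approximation of OU(t)f(x) in dimension d = 1."""
    x = np.asarray(x, dtype=float)
    arg = np.exp(-t) * x[..., None] + np.sqrt(1.0 - np.exp(-2.0 * t)) * NODES
    return np.sum(f(arg) * WEIGHTS, axis=-1)

def hermite(n):
    """Return the probabilists' Hermite polynomial of degree n as a callable."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return lambda x: hermeval(x, coeffs)

t = 0.4
x = np.linspace(-2.0, 2.0, 9)
errors = []
for n in range(6):
    H_n = hermite(n)
    errors.append(np.max(np.abs(ou(t, H_n, x) - np.exp(-n * t) * H_n(x))))
    print(f"n = {n}: max |OU(t)H_n - e^(-nt) H_n| = {errors[-1]:.2e}")
```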


13.6.f The Hermite Semigroup 


Let $0 \le V \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ be given; in applications we think of $V$ as a potential. Consider
the forms $\mathfrak{a}_\Delta, \mathfrak{a}_V : C_c^2(\mathbb{R}^d) \times C_c^2(\mathbb{R}^d) \to \mathbb{C}$ defined by

\[ \mathfrak{a}_\Delta(u,v) := \int_{\mathbb{R}^d} \nabla u(x)\cdot\overline{\nabla v(x)}\,dx, \qquad \mathfrak{a}_V(u,v) := \int_{\mathbb{R}^d} V(x)u(x)\overline{v(x)}\,dx. \]

The form $\mathfrak{a}_\Delta + \mathfrak{a}_V$ is closable, positive, and symmetric, and its closure $\mathfrak{a}$ is positive as
well. The operator $A$ associated with $\mathfrak{a}$ is densely defined, positive, and symmetric, so its
closure is selfadjoint by Theorem 12.17. We denote this closure somewhat suggestively
by $H := -\Delta + V$.

An interesting special case arises if we take $V(x) = |x|^2$. This results in the selfadjoint
operator $-\Delta + |x|^2$ on $L^2(\mathbb{R}^d)$, the so-called Hermite operator. In Quantum Mechanics,
the operator

\[ H := -\tfrac12\Delta + \tfrac12|x|^2 \]

is called the quantum harmonic oscillator. Let us here take a closer look at the $C_0$-
contraction semigroup generated by (a rescaled version of) $-H$.


Theorem 13.58 (The Hermite semigroup). The $C_0$-semigroup $S$ on $L^2(\mathbb{R}^d)$ generated by
$-H + \tfrac{d}{2}I$ is unitarily equivalent to the Ornstein–Uhlenbeck semigroup $OU$ on $L^2(\mathbb{R}^d,\gamma)$.
More precisely, we have

\[ U^{-1}S(t)U = OU(t), \qquad t \ge 0, \]

where $U : L^2(\mathbb{R}^d,\gamma) \to L^2(\mathbb{R}^d)$ is the unitary operator given by $U = D \circ E$, with
$D : L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$ and $E : L^2(\mathbb{R}^d,\gamma) \to L^2(\mathbb{R}^d)$ given by

\[ Df(x) = (\sqrt2)^{d/2} f(\sqrt2\,x), \qquad Ef(x) = \frac{1}{(2\pi)^{d/4}} \exp\bigl(-\tfrac14|x|^2\bigr) f(x). \]

Proof  Fix $f \in C_c^2(\mathbb{R}^d)$. Recalling from (13.30) that

\[ Lf = \Delta f - x\cdot\nabla f, \]

a somewhat tedious but straightforward computation gives the identity

\[ Lf = U^{-1}\bigl(\tfrac{d}{2} - H\bigr)Uf. \]

Passing to the resolvents, applying Theorem 13.19, and using the density of $C_c^2(\mathbb{R}^d)$ in
$L^2(\mathbb{R}^d,\gamma)$, the identity for the semigroups follows from this.
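For orientation, the "tedious but straightforward" computation can be sketched in dimension $d = 1$, where $H = -\tfrac12\frac{d^2}{dx^2} + \tfrac12 x^2$. The following worked calculation is an illustration for this special case only; it uses that $U = D \circ E$ acts as $(Uf)(x) = c\,e^{-x^2/2} f(\sqrt2\,x)$ with $c := \pi^{-1/4}$.

```latex
\begin{align*}
(Uf)'(x)  &= c\,e^{-x^2/2}\bigl[-x f(\sqrt2 x) + \sqrt2 f'(\sqrt2 x)\bigr],\\
(Uf)''(x) &= c\,e^{-x^2/2}\bigl[(x^2-1) f(\sqrt2 x) - 2\sqrt2\,x f'(\sqrt2 x) + 2 f''(\sqrt2 x)\bigr],\\
\bigl(\tfrac12 - H\bigr)Uf(x)
          &= \tfrac12 (Uf)(x) + \tfrac12 (Uf)''(x) - \tfrac12 x^2 (Uf)(x)\\
          &= c\,e^{-x^2/2}\bigl[f''(\sqrt2 x) - \sqrt2\,x f'(\sqrt2 x)\bigr]
           = (U Lf)(x).
\end{align*}
```

In the last step, $Lf(y) = f''(y) - y f'(y)$ is evaluated at $y = \sqrt2\,x$. Applying $U^{-1}$ to both sides gives $Lf = U^{-1}(\tfrac12 - H)Uf$, which is the asserted identity for $d = 1$.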


As an immediate consequence of this result and Theorem 13.57 we see that the Hermite
semigroup extends to an analytic $C_0$-semigroup of contractions on every sector $\Sigma_\omega$
with $\omega < \tfrac12\pi$.

It will follow from Theorem 15.58 that $\sigma(-L)$ equals $\mathbb{N} = \{0,1,2,\dots\}$ and consists
of eigenvalues (see Corollary 15.59). From this we see that the spectrum of $H$ equals

\[ \sigma(H) = \bigl\{ n + \tfrac{d}{2} : n \in \mathbb{N} \bigr\} \]

and consists of eigenvalues. The lowest eigenvalue $\tfrac12 d$ is the ground state energy of $H$.


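13.6.g The Schrödinger Group


We again consider the setting of Section 13.6.f. For nonnegative potentials $V \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$
we defined the positive operator $H := -\Delta + V$ as the operator associated with the closure
of the positive symmetric form $\mathfrak{a} := \mathfrak{a}_\Delta + \mathfrak{a}_V$, where

\[ \mathfrak{a}_\Delta(u,v) = \int_{\mathbb{R}^d} \nabla u(x)\cdot\overline{\nabla v(x)}\,dx, \qquad \mathfrak{a}_V(u,v) = \int_{\mathbb{R}^d} V(x)u(x)\overline{v(x)}\,dx \]

for $u,v \in C_c^2(\mathbb{R}^d)$. By Stone's theorem, the operator $iH$ generates a unitary $C_0$-group
on $L^2(\mathbb{R}^d)$, the so-called Schrödinger group with potential $V$. It solves the Schrödinger
equation with potential $V$,

\[ \frac{1}{i}\frac{\partial u}{\partial t}(t,x) = -\Delta u(t,x) + V(x)u(t,x), \qquad t \ge 0,\ x \in \mathbb{R}^d. \tag{13.34} \]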


The special case $V = 0$ is of special interest:

Example 13.59 (Free Schrödinger group). The $C_0$-group $(S(t))_{t\in\mathbb{R}}$ on $L^2(\mathbb{R}^d)$ generated
by the operator $A = i\Delta$ with domain $D(A) = H^2(\mathbb{R}^d)$ is called the free Schrödinger
group. For functions $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ and $t \ne 0$ it is given explicitly by the formula

\[ S(t)f(x) = \frac{1}{(4\pi i t)^{d/2}} \int_{\mathbb{R}^d} \exp\Bigl(\frac{i|x-y|^2}{4t}\Bigr) f(y)\,dy, \tag{13.35} \]

valid for almost all $x \in \mathbb{R}^d$. Note that, on a formal level, we have $S(t) = H(it)$, where
$(H(z))_{\operatorname{Re} z > 0}$ is the (holomorphic extension to the open right-half plane of the) heat
semigroup generated by $\Delta$ given by (13.21). We refer to Problem 13.12 for a proof that
the limit $H(it)f := \lim_{s\downarrow 0} H(s+it)f$ indeed exists for all $f \in L^2(\mathbb{R}^d)$; the point we are
making here is that this limit is still represented by the explicit formula (13.21) for the
heat semigroup evaluated at $it$. Taking the result of Problem 13.12 for granted, (13.35)
follows from (13.21), with $t$ replaced by $s + it$, by dominated convergence.

An alternative derivation can be given on the basis of the spectral theorem for selfadjoint
operators; see Problem 13.23. This idea will be explored more systematically in
Example 13.64.

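The representation (13.35) implies that $S(t)f \in L^\infty(\mathbb{R}^d)$ for all $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$
and $t \in \mathbb{R} \setminus \{0\}$, with bounds

\[ \|S(t)f\|_\infty \le \frac{1}{(4\pi|t|)^{d/2}} \|f\|_1, \qquad \|S(t)f\|_2 \le \|f\|_2, \]

the former by a direct estimate and the latter by Plancherel's theorem. By the Riesz–Thorin
interpolation theorem, for all $t \in \mathbb{R} \setminus \{0\}$ the operators $S(t)$, when restricted to
$L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, extend to bounded operators from $L^p(\mathbb{R}^d)$ to $L^q(\mathbb{R}^d)$ for all $1 \le p \le 2$
and $\frac1p + \frac1q = 1$, with bound

\[ \|S(t)\|_{\mathscr{L}(L^p(\mathbb{R}^d),L^q(\mathbb{R}^d))} \le \frac{1}{(4\pi|t|)^{\frac{d}{p}-\frac{d}{2}}}. \tag{13.36} \]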
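The exponent in (13.36) can be traced through the interpolation step as follows. This is a sketch of the bookkeeping only, assuming the Riesz–Thorin theorem is applied between the endpoint pairs $(1,\infty)$ and $(2,2)$ with interpolation parameter $\theta \in [0,1]$.

```latex
\begin{align*}
\frac1p &= \frac{1-\theta}{1} + \frac{\theta}{2}, \qquad
\frac1q = \frac{1-\theta}{\infty} + \frac{\theta}{2} = \frac{\theta}{2},
\qquad\text{so that}\quad \frac1p + \frac1q = 1 \quad\text{and}\quad 1-\theta = \frac2p - 1,\\
\|S(t)\|_{\mathscr{L}(L^p,L^q)}
 &\le \Bigl(\frac{1}{(4\pi|t|)^{d/2}}\Bigr)^{1-\theta}\cdot 1^{\theta}
  = \frac{1}{(4\pi|t|)^{\frac d2\left(\frac2p-1\right)}}
  = \frac{1}{(4\pi|t|)^{\frac dp-\frac d2}}.
\end{align*}
```

For $p = 1$ this recovers the $L^1$-$L^\infty$ bound and for $p = 2$ the $L^2$-isometry bound.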


13.6.h The Wave Group 


The Wave Group on Bounded Domains.  Let $D$ be a nonempty bounded open set in
$\mathbb{R}^d$. The space $\mathscr{H} := H_0^1(D) \times L^2(D)$ is a Hilbert space with respect to the norm given by

\[ \|(u,v)\|^2 := \|u\|_{H_0^1(D)}^2 + \|v\|_2^2, \]

where we consider the norm on $H_0^1(D)$ given by

\[ \|u\|_{H_0^1(D)}^2 := \int_D |\nabla u|^2\,dx, \qquad u \in H_0^1(D). \tag{13.37} \]

This norm is equivalent to the usual Sobolev norm on $H_0^1(D)$ by Poincaré's inequality.
In $\mathscr{H}$ we define the operator $A$ by

\[ A := \begin{pmatrix} 0 & I \\ \Delta_{\mathrm{Dir}} & 0 \end{pmatrix}, \qquad D(A) := D(\Delta_{\mathrm{Dir}}) \times H_0^1(D), \]

where $\Delta_{\mathrm{Dir}}$ is the Dirichlet Laplacian on $L^2(D)$. We will prove that $A$ is the generator of
a unitary $C_0$-group $W$ on $\mathscr{H}$. This group solves the linear wave equation

\begin{align*}
\frac{\partial^2 u}{\partial t^2}(t,x) &= \Delta u(t,x), && t \in \mathbb{R},\ x \in D,\\
u(0,x) &= u_0(x), && x \in D,\\
\frac{\partial u}{\partial t}(0,x) &= v_0(x), && x \in D,
\end{align*}

written as a system of first-order ODEs $u' = v$, $v' = \Delta u$, with initial value $u(0) = u_0$,
$v(0) = v_0$, subject to Dirichlet boundary conditions.
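The operator $A$ is densely defined and an integration by parts gives, for $u = \binom{u_1}{u_2} \in
D(A)$ and $v = \binom{v_1}{v_2} \in D(A)$,

\begin{align*}
(Au|v) &= \Bigl( \binom{u_2}{\Delta u_1} \Bigm| \binom{v_1}{v_2} \Bigr)
        = \int_D \nabla u_2\cdot\overline{\nabla v_1}\,dx + \int_D (\Delta u_1)\overline{v_2}\,dx \\
       &= \int_D \nabla u_2\cdot\overline{\nabla v_1}\,dx - \int_D \nabla u_1\cdot\overline{\nabla v_2}\,dx \\
       &= -\Bigl( \binom{u_1}{u_2} \Bigm| \binom{v_2}{\Delta v_1} \Bigr) = -(u|Av).
\end{align*}

This implies that $-iA$ is symmetric.

We next observe that $0 \in \rho(\Delta_{\mathrm{Dir}})$ by Theorem 12.26. This allows us to consider the
bounded operator

\[ R := \begin{pmatrix} 0 & \Delta_{\mathrm{Dir}}^{-1} \\ I & 0 \end{pmatrix} \]

on $\mathscr{H}$. For $(u,v) \in \mathscr{H}$ we have

\[ R\binom{u}{v} = \binom{\Delta_{\mathrm{Dir}}^{-1}v}{u} \in D(\Delta_{\mathrm{Dir}}) \times H_0^1(D) = D(A) \]

and it is immediate to check that $AR = I$ and $RAh = h$ for $h \in D(A)$. This proves that $A$ is
boundedly invertible, that is, $0 \in \rho(A)$. An application of Proposition 10.41 now gives
that $-iA$ is selfadjoint. Therefore, by Stone's theorem (Theorem 13.46) we obtain: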


Theorem 13.60 (Wave group on bounded domains). The operator $A$ generates a unitary
$C_0$-group $(W(t))_{t\in\mathbb{R}}$ on $H_0^1(D) \times L^2(D)$, provided $H_0^1(D)$ is endowed with the equivalent
norm given by (13.37).

The Wave Group on $\mathbb{R}^d$.  Let us next consider the case $D = \mathbb{R}^d$, which is not covered by
the above considerations since it was assumed that $D$ be bounded. We have $H_0^1(\mathbb{R}^d) =
H^1(\mathbb{R}^d)$ by Theorem 11.25 and Theorem 11.31 and $\Delta_{\mathrm{Dir}} = \Delta$ with $D(\Delta) = H^2(\mathbb{R}^d)$. This
suggests considering in $H^1(\mathbb{R}^d) \times L^2(\mathbb{R}^d)$ the operator

\[ A := \begin{pmatrix} 0 & I \\ \Delta & 0 \end{pmatrix}, \qquad D(A) := H^2(\mathbb{R}^d) \times H^1(\mathbb{R}^d). \]

We will use the theory of Fourier multipliers to prove that $A$ generates a $C_0$-group on
$H^1(\mathbb{R}^d) \times L^2(\mathbb{R}^d)$ and give an explicit expression for it.
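To motivate the upcoming expressions we first consider the matrix

\[ A_a = \begin{pmatrix} 0 & 1 \\ -a & 0 \end{pmatrix}, \]

where $a \ge 0$ is a nonnegative scalar, and compute its exponentials $e^{tA_a}$. The powers of
$A_a$ are given by

\[ A_a^{2k} = \begin{pmatrix} (-1)^k a^k & 0 \\ 0 & (-1)^k a^k \end{pmatrix}, \qquad
   A_a^{2k+1} = \begin{pmatrix} 0 & (-1)^k a^k \\ (-1)^{k+1} a^{k+1} & 0 \end{pmatrix}, \]

so that with $b := a^{1/2}$

\begin{align*}
e^{tA_a}
 &= \sum_{k=0}^\infty \begin{pmatrix} \dfrac{(-1)^k t^{2k} a^k}{(2k)!} & \dfrac{(-1)^k t^{2k+1} a^k}{(2k+1)!} \\[2ex] \dfrac{(-1)^{k+1} t^{2k+1} a^{k+1}}{(2k+1)!} & \dfrac{(-1)^k t^{2k} a^k}{(2k)!} \end{pmatrix} \\
 &= \sum_{k=0}^\infty \begin{pmatrix} \dfrac{(-1)^k (tb)^{2k}}{(2k)!} & b^{-1}\dfrac{(-1)^k (tb)^{2k+1}}{(2k+1)!} \\[2ex] -b\,\dfrac{(-1)^k (tb)^{2k+1}}{(2k+1)!} & \dfrac{(-1)^k (tb)^{2k}}{(2k)!} \end{pmatrix} \\
 &= \begin{pmatrix} \cos tb & b^{-1}\sin tb \\ -b\sin tb & \cos tb \end{pmatrix}.
\end{align*}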
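The closed form for $e^{tA_a}$ is easy to confirm numerically. The Python sketch below (an illustration; the function names `exp_series` and `closed_form` are hypothetical) sums a truncated power series for the matrix exponential and compares it with the cosine/sine matrix, interpreting $b^{-1}\sin tb$ as $t$ when $b = 0$.

```python
import numpy as np

def exp_series(M, terms=60):
    """Truncated power series sum_{k < terms} M^k / k! for a square matrix M."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

def closed_form(t, a):
    """The matrix [[cos tb, sin(tb)/b], [-b sin tb, cos tb]] with b = sqrt(a)."""
    b = np.sqrt(a)
    off_diagonal = t if b == 0 else np.sin(t * b) / b  # limit b -> 0 gives t
    return np.array([[np.cos(t * b), off_diagonal],
                     [-b * np.sin(t * b), np.cos(t * b)]])

for a, t in [(2.5, 1.3), (0.0, 0.7), (9.0, -0.4)]:
    A_a = np.array([[0.0, 1.0], [-a, 0.0]])
    gap = np.max(np.abs(exp_series(t * A_a) - closed_form(t, a)))
    print(f"a = {a}, t = {t}: max entrywise difference = {gap:.2e}")
```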
Substituting $-\Delta$ for $a$ and $(-\Delta)^{1/2}$ for $b$, we arrive at the following guess for the
expression for the wave group:

\[ W(t) = \begin{pmatrix} \cos(t(-\Delta)^{1/2}) & (-\Delta)^{-1/2}\sin(t(-\Delta)^{1/2}) \\ -(-\Delta)^{1/2}\sin(t(-\Delta)^{1/2}) & \cos(t(-\Delta)^{1/2}) \end{pmatrix}, \qquad t \ge 0. \]

We need to give a meaning to the operators occurring in this matrix, which can be
accomplished by the functional calculus of $-\Delta$, or by interpreting them as Fourier
multiplier operators as follows. The operators on the diagonal can be interpreted as Fourier
multiplier operators on $L^2(\mathbb{R}^d)$ associated with the multipliers

\[ m_{11,t}(\xi) = m_{22,t}(\xi) = \cos(t|\xi|); \]

this function belongs to $L^\infty(\mathbb{R}^d)$ with norm 1 for every $t \in \mathbb{R}$. Recalling the characterisation
of $H^1(\mathbb{R}^d)$ as those functions $f$ in $L^2(\mathbb{R}^d)$ for which $\xi \mapsto (1+|\xi|^2)^{1/2}\hat f(\xi)$ is in
$L^2(\mathbb{R}^d)$, we see moreover that $\cos(t(-\Delta)^{1/2})$ maps $W^{1,2}(\mathbb{R}^d) = H^1(\mathbb{R}^d)$ into itself. Recalling the
norm of $H^1(\mathbb{R}^d)$ given by (11.9), this argument also gives the estimates

\[ \bigl\|\cos(t(-\Delta)^{1/2})\bigr\|_{\mathscr{L}(L^2(\mathbb{R}^d))} \le 1, \qquad \bigl\|\cos(t(-\Delta)^{1/2})\bigr\|_{\mathscr{L}(H^1(\mathbb{R}^d))} \le 1. \]
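Similarly we can associate a bounded operator from $H^1(\mathbb{R}^d)$ to $L^2(\mathbb{R}^d)$ with the multiplier

\[ m_{21,t}(\xi) = -|\xi|\sin(t|\xi|). \]

Indeed, if $f \in H^1(\mathbb{R}^d)$, then

\[ \bigl\|\xi \mapsto m_{21,t}(\xi)\hat f(\xi)\bigr\|_{L^2(\mathbb{R}^d)} \le \Bigl( \sup_{\xi\in\mathbb{R}^d} \frac{|\xi|}{(1+|\xi|^2)^{1/2}} |\sin(t|\xi|)| \Bigr) \|f\|_{H^1(\mathbb{R}^d)} \le \|f\|_{H^1(\mathbb{R}^d)} \]

and therefore

\[ \bigl\|(-\Delta)^{1/2}\sin(t(-\Delta)^{1/2})\bigr\|_{\mathscr{L}(H^1(\mathbb{R}^d),L^2(\mathbb{R}^d))} \le 1 \]

for all $t \in \mathbb{R}$. In the same way the operators $(-\Delta)^{-1/2}\sin(t(-\Delta)^{1/2})$ are interpreted as
bounded operators from $L^2(\mathbb{R}^d)$ into $H^1(\mathbb{R}^d)$ given by the multipliers

\[ m_{12,t}(\xi) = \frac{\sin(t|\xi|)}{|\xi|}, \]

which satisfy (distinguish the cases $|\xi| \le 1$ and $|\xi| > 1$)

\begin{align*}
\bigl\|\xi \mapsto (1+|\xi|^2)^{1/2} m_{12,t}(\xi)\hat f(\xi)\bigr\|_{L^2(\mathbb{R}^d)}
 &\le \Bigl( \sup_{\xi\in\mathbb{R}^d} \frac{|\sin(t|\xi|)|}{|\xi|}(1+|\xi|^2)^{1/2} \Bigr) \|\hat f\|_{L^2(\mathbb{R}^d)} \\
 &\le C(1+|t|)\|\hat f\|_{L^2(\mathbb{R}^d)} = C(1+|t|)\|f\|_{L^2(\mathbb{R}^d)},
\end{align*}

where $C$ is a universal constant. Therefore

\[ \bigl\|(-\Delta)^{-1/2}\sin(t(-\Delta)^{1/2})\bigr\|_{\mathscr{L}(L^2(\mathbb{R}^d),H^1(\mathbb{R}^d))} \le C(1+|t|) \]

for all $t \in \mathbb{R}$.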
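The case distinction behind the bound $C(1+|t|)$ can be written out as follows. This is a sketch that assumes the $H^1$-norm is given by $\|(1+|\xi|^2)^{1/2}\hat f\|_2$; under that assumption the constant $C = \sqrt2$ is admissible. Write $r = |\xi|$ and use $|\sin(tr)| \le \min\{1, |t|r\}$.

```latex
\begin{align*}
0 < r \le 1:&\quad \frac{|\sin(tr)|}{r}\,(1+r^2)^{1/2} \le |t|\,(1+r^2)^{1/2} \le \sqrt2\,|t|,\\
r > 1:&\quad \frac{|\sin(tr)|}{r}\,(1+r^2)^{1/2} \le \Bigl(1+\frac{1}{r^2}\Bigr)^{1/2} \le \sqrt2,\\
\text{hence}&\quad \sup_{\xi\in\mathbb{R}^d} \frac{|\sin(t|\xi|)|}{|\xi|}\,(1+|\xi|^2)^{1/2}
   \le \sqrt2\,\max\{1,|t|\} \le \sqrt2\,(1+|t|).
\end{align*}
```

The linear growth in $|t|$ comes entirely from the low frequencies $|\xi| \le 1$.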
Theorem 13.61 (Wave group on $\mathbb{R}^d$). The operator $A$ generates a $C_0$-group $(W(t))_{t\in\mathbb{R}}$
on $H^1(\mathbb{R}^d) \times L^2(\mathbb{R}^d)$ which is given by

\[ W(t) = \begin{pmatrix} \cos(t(-\Delta)^{1/2}) & (-\Delta)^{-1/2}\sin(t(-\Delta)^{1/2}) \\ -(-\Delta)^{1/2}\sin(t(-\Delta)^{1/2}) & \cos(t(-\Delta)^{1/2}) \end{pmatrix}, \qquad t \in \mathbb{R}. \]

Moreover, there is a constant $C \ge 0$ such that

\[ \|W(t)\| \le C(1+|t|), \qquad t \in \mathbb{R}. \]

Proof  The group property follows from formal matrix multiplication, which can be
made rigorous by noting that in the Fourier domain we are just multiplying matrices of
scalar-valued multipliers, much like what we did in the treatment of the heat and Poisson
semigroups. Once we have proved strong continuity, differentiation of the entries at
$t = 0$ identifies $\begin{pmatrix} 0 & I \\ \Delta & 0 \end{pmatrix}$ as the generator by the same reasoning.

To prove strong continuity we note that $\|m_{11,t}\|_\infty \le 1$ and $\lim_{t\to 0} m_{11,t}(\xi) = 1$ pointwise,
implying that $\lim_{t\to 0} m_{11,t}\hat f = \hat f$ in $L^2(\mathbb{R}^d)$ by dominated convergence. Hence
$\lim_{t\to 0} \cos(t(-\Delta)^{1/2})f = f$ in $L^2(\mathbb{R}^d)$ for all $f \in L^2(\mathbb{R}^d)$ by Plancherel's theorem. The
strong convergence of the other three terms is proved similarly.


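Remark 13.62. Notwithstanding the linear growth bound for $W(t)$, the energy functional

\[ E(t) = \|\partial_t W(t)f\|_2^2 + \|\nabla W(t)f\|_2^2 \]

is constant in time (here $W(t)f$ is understood as its first component, the solution
$u(t,\cdot)$ of the wave equation). For functions $f \in D(A)$ one has $E'(t) = 0$ by direct
calculation, and the general case then follows by density.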
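Both the group property and the conservation of energy can be observed at the level of the multiplier matrices, one frequency at a time. The Python sketch below is an illustration under assumptions: `symbol` and `energy` are hypothetical helper names, the Fourier data are made up, and the energy density at a frequency $\xi$ with $|\xi| = r$ is taken to be $r^2|\hat u|^2 + |\hat v|^2$, which corresponds to $\|\nabla u\|_2^2 + \|\partial_t u\|_2^2$ after integration over $\xi$.

```python
import numpy as np

def symbol(t, r):
    """Multiplier matrix of W(t) at a frequency xi with |xi| = r > 0."""
    c, s = np.cos(t * r), np.sin(t * r)
    return np.array([[c, s / r], [-r * s, c]])

def energy(r, state):
    """Energy density r^2 |u_hat|^2 + |v_hat|^2 at a single frequency."""
    u_hat, v_hat = state
    return r**2 * abs(u_hat) ** 2 + abs(v_hat) ** 2

r, t, s = 3.0, 0.8, 1.7
state0 = np.array([1.0 + 2.0j, -0.5 + 0.25j])  # hypothetical Fourier data (u_hat, v_hat)

group_ok = np.allclose(symbol(t, r) @ symbol(s, r), symbol(t + s, r))
e0 = energy(r, state0)
e1 = energy(r, symbol(t, r) @ state0)
print("group property holds:", group_ok)
print("energy before and after:", e0, e1)
```

The full $H^1 \times L^2$ norm contains the additional term $|\hat u|^2$, which is not conserved; at low frequencies the entry $\sin(tr)/r$ is close to $t$, which is consistent with the linear growth bound in Theorem 13.61.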


We conclude with some informal remarks establishing a connection with the Poisson
semigroup of Section 13.6.d. Since $(-\Delta)^{1/2}$ with domain $H^1(\mathbb{R}^d)$ is selfadjoint,
$i(-\Delta)^{1/2}$ is the generator of a $C_0$-group $(U(t))_{t\in\mathbb{R}}$ of unitary operators on $L^2(\mathbb{R}^d)$ by
Stone's theorem. For $f \in H^2(\mathbb{R}^d)$ we have $(-\Delta)^{1/2}f \in H^1(\mathbb{R}^d)$ and

\[ \frac{d^2}{dt^2}U(t)f = \bigl[i(-\Delta)^{1/2}\bigr]^2 U(t)f = \Delta U(t)f. \]

In this sense, $t \mapsto u(t,x) = U(t)f(x)$ satisfies the wave equation with initial condition
$u(0,x) = f(x)$. We are neglecting the initial condition for the first derivative, however,
and in fact we could run the same argument for $-i(-\Delta)^{1/2}$ which is the generator of
the $C_0$-group $(U(-t))_{t\in\mathbb{R}}$ to find that its orbits also solve the wave equation with initial
condition $u(0,x) = f(x)$. Interpreting $U(t)$ and $U(-t)$ as Fourier multipliers one sees
that

\[ \tfrac12\bigl(U(t) + U(-t)\bigr) = \cos(t(-\Delta)^{1/2}), \]

which is the first entry in the matrix representation for the wave group. These operators
solve the wave equation with initial conditions $u(0,x) = f(x)$ and $\frac{\partial u}{\partial t}(0,x) = 0$, the latter
because of the cancellation of the derivatives of $U(t)f$ and $U(-t)f$. This argument is
admittedly somewhat sketchy; the reader is invited to provide the rigorous details.


13.7 Semigroups Generated by Normal Operators 


We begin with a general observation about semigroup generation by normal operators. 
Recall from Section 13.4 the notation 


\[ \Sigma_\omega := \{z \in \mathbb{C} \setminus \{0\} : |\arg(z)| < \omega\} \]

for the open sector of angle $\omega \in (0,\pi)$, arguments being taken in $(-\pi,\pi)$.


Theorem 13.63 (Semigroups generated by normal operators). Let $N$ be a normal operator
in $H$ with associated projection-valued measure $P$. Then:

(1) if $\sigma(N)$ is contained in the closed right half-plane, then $-N$ is the generator of a
    $C_0$-semigroup of contractions $S$, given by

    \[ S(t) = \int_{\sigma(N)} e^{-\lambda t}\,dP(\lambda), \qquad t \ge 0; \]

(2) if $\sigma(N)$ is contained in a closed sector of angle $0 < \theta < \tfrac12\pi$, then $-N$ is the generator
    of an analytic $C_0$-semigroup of contractions $(S(z))_{z\in\Sigma_{\frac12\pi-\theta}}$, given by

    \[ S(z) = \int_{\sigma(N)} e^{-\lambda z}\,dP(\lambda), \qquad z \in \Sigma_{\frac12\pi-\theta}. \]

Proof  We give a detailed proof of (1); the proof of (2) is entirely similar.

First of all, the operators $S(t)$ are well defined and contractive by Theorem 9.8(ii).
We next check that the semigroup is strongly continuous. For all $x \in H$, dominated
convergence gives

\[ \lim_{t\downarrow 0}(S(t)x|x) = \lim_{t\downarrow 0}\int_{\sigma(N)} e^{-\lambda t}\,dP_x(\lambda) = \int_{\sigma(N)} dP_x(\lambda) = (x|x). \]

By a polarisation argument, this gives the weak continuity of the semigroup. By Theorem
13.11, this implies its strong continuity. It remains to be shown that $-N$ is its generator.
If $x \in D(N)$, that is, if $\int_{\sigma(N)} |\lambda|^2\,dP_x(\lambda) < \infty$, then by dominated convergence

\[ \lim_{t\downarrow 0}\frac1t(S(t)x - x|x) = \lim_{t\downarrow 0}\int_{\sigma(N)} \frac{e^{-\lambda t}-1}{t}\,dP_x(\lambda) = -\int_{\sigma(N)} \lambda\,dP_x(\lambda) = -(Nx|x). \tag{13.38} \]

The same argument proves that

\[ \lim_{t\downarrow 0}\frac{1}{t^2}\|S(t)x - x\|^2 = \lim_{t\downarrow 0}\frac{1}{t^2}\bigl((S^\star(t) - I)(S(t)x - x)\big|x\bigr) = \int_{\sigma(N)} |\lambda|^2\,dP_x(\lambda) = \|Nx\|^2, \tag{13.39} \]

using the final identity in the statement of Theorem 10.48 in the last step.
By polarisation, (13.38) implies that

\[ \lim_{t\downarrow 0}\frac1t(S(t)x - x|y) = -(Nx|y), \qquad x \in D(N),\ y \in D(N), \]

and hence, using that $\limsup_{t\downarrow 0}\frac1t\|S(t)x - x\| < \infty$ by (13.39), by approximation we
obtain

\[ \lim_{t\downarrow 0}\frac1t(S(t)x - x|y) = -(Nx|y), \qquad x \in D(N),\ y \in H. \]

Denoting the generator of the semigroup by $A$, for $y \in D(A^\star)$ it follows that

\[ -(Nx|y) = \lim_{t\downarrow 0}\frac1t(S(t)x - x|y) = \lim_{t\downarrow 0}\frac1t(x|S^\star(t)y - y) = (x|A^\star y). \]

This implies that $x \in D(A^{\star\star}) = D(A)$ with $Ax = -Nx$, referring to Proposition 10.23 for
the equality of these domains.

We have thus proved that $-N \subseteq A$. Since $(0,\infty)$ is contained in the resolvent sets
of both $-N$ (by assumption) and $A$ (since it generates a $C_0$-contraction semigroup),
Proposition 10.30 implies $A = -N$.
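In finite dimensions the theorem reduces to a statement about normal matrices, which can be explored directly. The Python sketch below is an illustration, not part of the proof: it builds a random normal matrix $N = Q\,\mathrm{diag}(\lambda)\,Q^\star$ with $\operatorname{Re}\lambda_j \ge 0$ (the construction and the names `Q`, `eigenvalues`, `S` are assumptions made for the example) and checks contractivity, the semigroup property, and that the difference quotient at $0$ approximates $-N$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Random unitary Q from a QR factorisation, and eigenvalues with Re(lambda) >= 0.
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
eigenvalues = rng.uniform(0.0, 2.0, size=n) + 1j * rng.uniform(-3.0, 3.0, size=n)
N = Q @ np.diag(eigenvalues) @ Q.conj().T  # normal, spectrum in the closed right half-plane

def S(t):
    """S(t) = sum_j exp(-lambda_j t) P_j, with P_j the spectral projections of N."""
    return Q @ np.diag(np.exp(-eigenvalues * t)) @ Q.conj().T

t, s, h = 0.6, 1.1, 1e-6
norm_S = np.linalg.norm(S(t), 2)                               # spectral norm of S(t)
semigroup_ok = np.allclose(S(t) @ S(s), S(t + s))              # S(t)S(s) = S(t+s)
generator_gap = np.linalg.norm((S(h) - np.eye(n)) / h + N, 2)  # difference quotient vs. -N

print("||S(t)|| =", norm_S)
print("semigroup property holds:", semigroup_ok)
print("||(S(h) - I)/h + N|| =", generator_gap)
```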


Some of the semigroup examples of the previous section can be constructed rather 
easily using the spectral theorem. 


Example 13.64 (Heat semigroup, Poisson semigroup, free Schrödinger group, wave
group revisited). Let $P$ be the projection-valued measure on $\mathbb{R}$ associated with the negative
Laplace operator $-\Delta$, viewed as a selfadjoint operator on $L^2(\mathbb{R}^d)$ (see Problem
10.17). The heat semigroup $H$ is then given by

\[ H(t) = \int_{[0,\infty)} e^{-t\lambda}\,dP(\lambda), \qquad t \ge 0, \]

and the free Schrödinger group by

\[ S(t) = \int_{[0,\infty)} e^{-it\lambda}\,dP(\lambda), \qquad t \ge 0. \]

The positive square root $(-\Delta)^{1/2}$ defined through Proposition 10.58 coincides with
the unbounded Fourier multiplier operator corresponding to the multiplier $m(\xi) = |\xi|$.
Using Proposition 10.50 to switch between the projection-valued measure $P$ of $-\Delta$ and
$Q$ of $(-\Delta)^{1/2}$, we see that the Poisson semigroup generated by the latter is given by

\[ \exp\bigl(-t(-\Delta)^{1/2}\bigr) = \int_{[0,\infty)} e^{-t\lambda}\,dQ(\lambda) = \int_{[0,\infty)} e^{-t\lambda^{1/2}}\,dP(\lambda), \qquad t \ge 0. \]

In the same way, the operators $\cos(t(-\Delta)^{1/2})$ and $\sin(t(-\Delta)^{1/2})$ featuring in the
wave group are given by

\begin{align*}
\cos(t(-\Delta)^{1/2}) &= \int_{[0,\infty)} \cos(t\lambda)\,dQ(\lambda) = \int_{[0,\infty)} \cos(t\lambda^{1/2})\,dP(\lambda),\\
\sin(t(-\Delta)^{1/2}) &= \int_{[0,\infty)} \sin(t\lambda)\,dQ(\lambda) = \int_{[0,\infty)} \sin(t\lambda^{1/2})\,dP(\lambda).
\end{align*}

Example 13.65 (Stone's theorem revisited). Let $P$ be the projection-valued measure
on $\mathbb{R}$ associated with a selfadjoint operator $A$ on a Hilbert space $H$. Then the unitary
$C_0$-group $(U(t))_{t\in\mathbb{R}}$ generated by $iA$ is given by

\[ U(t) = \int_{\mathbb{R}} e^{it\lambda}\,dP(\lambda), \qquad t \ge 0. \]


Problems 


13.1 Let $S$ be a $C_0$-semigroup on $X$ with generator $A$, and suppose that $\|S(t)\| \le Me^{\mu t}$
     for all $t \ge 0$. Prove that

     \[ \|(\lambda - A)^{-k}\| \le M/(\operatorname{Re}\lambda - \mu)^k, \qquad \operatorname{Re}\lambda > \mu,\ k = 1,2,\dots \]

     Hint: By considering $A - \mu$ instead of $A$ we may assume that $\mu = 0$. Under this
     assumption, observe that $|||R(\lambda,A)||| \le 1/\operatorname{Re}\lambda$, where $|||x||| := \sup_{t\ge 0}\|S(t)x\|$
     defines an equivalent norm on $X$.
     Remark: The converse holds as well: If $A$ is a densely defined operator on $X$
     satisfying the above inequalities, then $A$ generates a $C_0$-semigroup on $X$ satisfying
     $\|S(t)\| \le Me^{\mu t}$ for all $t \ge 0$. This is the version of the Hille–Yosida theorem for
     arbitrary $C_0$-semigroups. The ambitious reader may try to prove this.

13.2 The aim of this problem is to prove that if $A$ generates a $C_0$-semigroup $S$ on $X$,
     then $A$ is bounded if and only if

     \[ \lim_{t\downarrow 0}\|S(t) - I\| = 0. \]

     (a) Show that if $A$ is bounded, then $S(t) = e^{tA}$ and $\lim_{t\downarrow 0}\|S(t) - I\| = 0$.

     (b) Use a Neumann series argument to prove that if $\lim_{t\downarrow 0}\|S(t) - I\| = 0$, then
         for small enough $t > 0$ the operators $T_t \in \mathscr{L}(X)$ defined by

         \[ T_t x := \int_0^t S(s)x\,ds \]

         are invertible, and show that for such $t > 0$ we have

         \[ A = T_t^{-1}(S(t) - I). \]


13.3 Let $A$ be the generator of a $C_0$-semigroup $S$ on $X$. This problem gives a rigorous
     interpretation to the “formula” “$S(t) = e^{tA}$”.
     For each $h > 0$ consider the bounded operator $A(h)x := \frac1h(S(h)x - x)$.

     (a) Choosing $M \ge 1$ and $\omega \ge 0$ so that $\|S(t)\| \le Me^{\omega t}$ for all $t \ge 0$, show that

         \[ \|e^{tA(h)}\| \le M\exp\Bigl(\frac{t}{h}(e^{\omega h} - 1)\Bigr). \]

         Deduce that for all $0 < h < 1$ we have $\|e^{tA(h)}\| \le Me^{t(e^\omega - 1)}$.

     (b) Using the identity

         \[ S(t)x - e^{tA(h)}x = \int_0^t \frac{d}{ds}\bigl[e^{(t-s)A(h)}S(s)x\bigr]\,ds \]

         deduce from part (a) that for all $x \in D(A)$ and $0 < h < 1$ we have

         \[ \|S(t)x - e^{tA(h)}x\| \le tM^2 e^{t(e^\omega - 1)}\|Ax - A(h)x\|. \]

     (c) Prove that for all $x \in X$ and $t \ge 0$ we have

         \[ \lim_{h\downarrow 0} e^{tA(h)}x = S(t)x. \]

     (d) For $n \in \mathbb{N}$ with $n > \omega$, let $A_n := nAR(n,A)$ as in the proof of the Hille–Yosida
         theorem. Prove that for all $x \in X$ and $t \ge 0$ we have

         \[ \lim_{n\to\infty} e^{tA_n}x = S(t)x. \]

13.4 Let $A$ be the generator of the $C_0$-semigroup $S$ on $L^2(0,1)$ of left translations,
     inserting zeroes from the right. Show that $\sigma(A) = \varnothing$.

13.5 Let $A$ be the generator of a $C_0$-semigroup $S$ on $X$. The adjoint semigroup on $X^*$ is
     the family $S^* = (S^*(t))_{t\ge 0}$, where $S^*(t) = (S(t))^*$ for $t \ge 0$.

     (a) Show that the adjoint semigroup has the semigroup properties (S1) and (S2)
         but may fail (S3).

     Set

     \[ X^\odot := \bigl\{x^* \in X^* : \lim_{t\downarrow 0}\|S^*(t)x^* - x^*\| = 0\bigr\}. \]

     (b) Show that $X^\odot$ is a closed subspace of $X^*$.

     (c) Show that the adjoint semigroup maps $X^\odot$ into itself and that its restriction
         to $X^\odot$ is a $C_0$-semigroup.

     (d) Show that for all $x^* \in X^*$ and $t > 0$ there exists a unique element $\phi_{t,x^*} \in X^*$
         satisfying

         \[ \langle x, \phi_{t,x^*}\rangle = \int_0^t \langle x, S^*(s)x^*\rangle\,ds, \qquad x \in X. \]

     (e) Show that for all $x^* \in X^*$ and $t > 0$ we have $\phi_{t,x^*} \in D(A^*)$ and

         \[ A^*\phi_{t,x^*} = S^*(t)x^* - x^*. \]

     (f) Show that $X^\odot = \overline{D(A^*)}$ and deduce that $X^\odot$ is weak$^*$ dense in $X^*$.

     (g) Show that

         \[ D(A^*) = \Bigl\{x^* \in X^* : \limsup_{t\downarrow 0}\frac1t\|S^*(t)x^* - x^*\| < \infty\Bigr\}. \]

     (h) Show that if $X$ is reflexive, then $S^*$ is a $C_0$-semigroup on $X^*$ and $A^*$ is its
         generator.
         Hint: For the strong continuity apply Phillips's theorem (Theorem 13.11);
         for the identification of the generator use Proposition 10.30.


13.6 In this problem we prove a continuous analogue of the Sz.-Nagy dilation theorem
     (Theorem 8.36). Let $S$ be a $C_0$-semigroup of contractions on a Hilbert space $H$.

     (a) Show that the mapping $T : \mathbb{R} \to \mathscr{L}(H)$ defined by

         \[ T(t) := \begin{cases} S(t), & t > 0,\\ I, & t = 0,\\ (S(-t))^\star, & t < 0, \end{cases} \]

         is positive definite.
         Hint: For $t_1,\dots,t_N \in \mathbb{Q}$ use Lemma 8.35 to show that for all $h_1,\dots,h_N \in H$
         we have $\sum_{m,n=1}^N (T(t_m - t_n)h_m|h_n) \ge 0$.

     (b) Show that there exist a Hilbert space $\widetilde H$ containing $H$ as a closed subspace
         and a $C_0$-group $(U(t))_{t\in\mathbb{R}}$ on $\widetilde H$ such that

         \[ T(t)h = PU(t)h, \qquad t \ge 0,\ h \in H, \]

         where $P$ is the orthogonal projection of $\widetilde H$ onto $H$.
         Hint: Combine the result of part (a) with Theorem 8.34 to obtain the dilation
         and use Theorem 13.11 to prove its strong continuity.


13.7 Let $A$ be the generator of a $C_0$-semigroup $S$ on $X$. Prove the following spectral
     inclusion formula: for all $t \ge 0$ we have

     \[ \sigma(S(t)) \supseteq \exp(t\sigma(A)). \]

     Hint: Suppose $e^{\lambda t} \in \rho(S(t))$ for some $\lambda \in \mathbb{C}$ and denote the inverse of $e^{\lambda t}I - S(t)$
     by $Q_{\lambda,t}$. Show that the bounded operator $B_\lambda$ defined by

     \[ B_\lambda x := \int_0^t e^{\lambda(t-s)}S(s)Q_{\lambda,t}x\,ds \]

     is a two-sided inverse for $\lambda - A$.

13.8 Let $A$ be the generator of a $C_0$-semigroup on $X$, and let $f \in L^1(0,T;X)$ be given
     and fixed. A function $u \in L^1(0,T;X)$ is said to be a weak solution of the
     inhomogeneous abstract Cauchy problem

     \begin{align*}
     u'(t) &= Au(t) + f(t), \qquad t \in [0,T],\\
     u(0) &= u_0,
     \end{align*}

     if for all $t \in [0,T]$ and $x^* \in D(A^*)$ we have

     \[ \langle u(t),x^*\rangle = \langle u_0,x^*\rangle + \int_0^t \langle u(s),A^*x^*\rangle\,ds + \int_0^t \langle f(s),x^*\rangle\,ds. \]

     Prove that the inhomogeneous abstract Cauchy problem has a unique weak solution,
     and that it equals the unique strong solution.
13.9 On $\mathbb{C}^2$ consider the norm $\|\cdot\|_Q$ induced by the inner product $(x|y)_Q := (Qx|y)$,

where 
1. 2 
o=(; 4). 


(a) Show that the symmetric matrix $Q$ is positive and conclude that $(\cdot|\cdot)_Q$ does
indeed define an inner product on $\mathbb{C}^2$.


On $(\mathbb{C}^2, \|\cdot\|_Q)$ we consider the $C_0$-semigroup $S$,


S(t) =e"? & ) 


(b) Show that ||S(z) a = et (t? +2+1vVt? +4) and conclude that S(t) is con- 
tractive for all t > 0. 


Hint: Use the fact that $\|S(t)\|_Q^2$ equals the largest eigenvalue of $S(t)S^\star(t)$ (see
Problem 4.14), where the adjoint refers to the inner product $(\cdot|\cdot)_Q$.


(c) Show that $S$ extends to an entire $C_0$-semigroup which is uniformly bounded
on the open sector $\Sigma_\theta$ for all $0 < \theta < \tfrac12\pi$.
(d) Show that $S$ fails to be contractive on any open sector $\Sigma_\theta$.


13.10 Let $A$ be the generator of an analytic $C_0$-semigroup on $X$. Show that if $B$ is a
      bounded operator on $X$, then $A + B$ generates an analytic $C_0$-semigroup on $X$.
      Also show that if $A$ generates a bounded analytic $C_0$-semigroup, then so does
      $A + B - \|B\|I$.
      Hint: First prove the second assertion.

13.11 Let —A be the generator of an analytic Co-contraction semigroup on a Hilbert 
space H. Show that the form a on H with domain D(a) := D(A) defined by 
a(x,y) := (Ax|y) is accretive, continuous, and closable. 

13.12 Let $A$ be the generator of a $C_0$-semigroup $S$ on $X$ which is bounded
      analytic on the open right-half plane. Show that for all $t \in \mathbb{R}$ and $x \in X$ the limit

      \[ T(t)x := \lim_{s\downarrow 0} S(s + it)x \]

      exists and that the family $(T(t))_{t\in\mathbb{R}}$ is a uniformly bounded $C_0$-group of operators
      with generator $iA$.
      Hint: Begin by observing that if $0 < s' < s < \infty$ and $-\infty < t' < t < \infty$, then

      \[ \|S(s + it)x - S(s' + it)x\| \le M\|S(s - s')x - x\|, \]

      where $M = \sup_{\operatorname{Re} z > 0}\|S(z)\|$. Deduce from this the existence of the limits. Deduce
      the semigroup properties and strong continuity in a similar manner. Finally show
      that if $x \in D(A)$, then

      \[ \int_0^t T(s)iAx\,ds = T(t)x - x \]

      to deduce that $x \in D(B)$, where $B$ is the generator of $(T(t))_{t\in\mathbb{R}}$, and use this to
      conclude that $B = iA$.

13.13 Let $A$ be the generator of an analytic $C_0$-semigroup $S$ on $X$. Prove that if $\sigma(A) \subseteq
      \{z \in \mathbb{C} : \operatorname{Re} z < 0\}$, then $S$ is uniformly exponentially stable, that is, there exists
      an exponent $\omega > 0$ such that $\sup_{t\ge 0} e^{\omega t}\|S(t)\| < \infty$.

      Hint: Verify the assumptions of Theorem 13.30 for $\omega + A$ for small enough $\omega > 0$.

13.14 This problem gives a two-dimensional example of a bounded analytic $C_0$-semigroup
      which is uniformly exponentially stable, contractive on $\mathbb{R}_+$, and fails to be
      contractive on any open sector containing $\mathbb{R}_+$.

13.15 Let $(S(t))_{t\ge 0}$ be a $C_0$-semigroup on $X$ and let $1 \le p < \infty$. Prove the Datko–Pazy
      theorem: The semigroup $(S(t))_{t\ge 0}$ is uniformly exponentially stable (see Problem
      13.13) if and only if the orbit $t \mapsto S(t)x$ belongs to $L^p(\mathbb{R}_+;X)$ for all $x \in X$.

      Hint: Apply the uniform boundedness theorem and reason by contradiction.

13.16 Let $A$ be the generator of a $C_0$-semigroup $(S(t))_{t\ge 0}$ on a Hilbert space $H$. Prove
      the Gearhart–Prüss theorem: The semigroup $(S(t))_{t\ge 0}$ is uniformly exponentially
      stable (see Problem 13.13) if and only if $\{\lambda \in \mathbb{C} : \operatorname{Re}\lambda > 0\} \subseteq \rho(A)$ and

      \[ \sup_{\operatorname{Re}\lambda > 0}\|R(\lambda,A)\| < \infty. \]

      Hint: Suppose that $\|S(t)\| \le Me^{\omega t}$. Complete the following steps:
      (a) Extend the Fourier–Plancherel theorem to $L^2(\mathbb{R}^d;H)$.

      (b) Prove that $t \mapsto R(s + it,A)x$ belongs to $L^2(\mathbb{R};H)$ for all $x \in H$ and $s > \omega$.

      (c) Use a Taylor expansion argument for the resolvent operators near the imaginary
          axis to prove that there exists a $\delta > 0$ such that $\{\lambda \in \mathbb{C} : \operatorname{Re}\lambda > -\delta\} \subseteq
          \rho(A)$ and

          \[ \sup_{\operatorname{Re}\lambda > -\delta}\|R(\lambda,A)\| < \infty. \]

          Hint: Compare with the proof of Lemma 13.33.

      (d) Use the resolvent identity to prove that the function $t \mapsto R(it,A)x$ belongs to
          $L^2(\mathbb{R};H)$ for all $x \in H$.

      (e) Conclude that $t \mapsto S(t)x$ belongs to $L^2(\mathbb{R}_+;H)$ for all $x \in H$.


13.17 This problem shows that the Gearhart–Prüss theorem of Problem 13.16 does not
      extend to general Banach spaces.
      Let $1 \le p < q < \infty$ and let $X := L^p(1,\infty) \cap L^q(1,\infty)$. This space is a Banach
      space under the norm $\|f\| := \max\{\|f\|_p, \|f\|_q\}$ (see Problem 2.21). On $X$ define
      the operators $S(t)$, $t \ge 0$, by

      \[ (S(t)f)(x) := f(xe^t), \qquad x > 1. \]

      (a) Show that $S$ is a $C_0$-semigroup on $X$ with generator $A$ given by

          \begin{align*}
          D(A) &:= \{f \in X : x \mapsto xf'(x) \in X\},\\
          (Af)(x) &:= xf'(x), \qquad x > 1,\ f \in D(A).
          \end{align*}

      (b) Show that $\{\operatorname{Re}\lambda > -1/q\} \subseteq \rho(A)$ and that for all $\omega' > -1/q$ we have

          \[ \sup_{\operatorname{Re}\lambda > \omega'}\|R(\lambda,A)\| < \infty. \]

      (c) Show that for all $\omega < -1/p$ we have $\lim_{t\to\infty} e^{-\omega t}\|S(t)\| = \infty$.
13.18 For $x \in X$ define

      \[ \partial(x) := \{x^* \in X^* : \|x^*\| = \|x\|,\ \langle x,x^*\rangle = \|x\|^2\}. \]

      This set is called the subdifferential of $x$.

      (a) Show that $\partial(x) \ne \varnothing$.
      (b) Show that if $X$ is a Hilbert space, then for all $x \in X$ we have

          \[ \partial(x) = \{x\}. \tag{13.40} \]

      (c) Let $1 < p < \infty$ and $\frac1p + \frac1q = 1$. Show that if $X = L^p(\Omega)$, then for all $f \in L^p(\Omega)$
          we have

          \[ \partial(f) = \{g_{f,p}\}, \tag{13.41} \]

          where $g_{f,p} \in L^q(\Omega)$ is defined by writing $f(\omega) = e^{i\theta(\omega)}|f(\omega)|$ and setting
          $g_{f,p}(\omega) = e^{-i\theta(\omega)}|f(\omega)|^{p-1}$.

      (d) Is the subdifferential always a singleton?
      (e) Using subdifferentials, extend the Lumer–Phillips theorem (Theorem 13.35)
          to Banach spaces.


13.19 Discuss Examples 13.38 and 13.41 for Neumann boundary conditions. 

13.20 Prove the identity (13.33). 

13.21 Consider the wave groups $W$ on a bounded open set $D \subseteq \mathbb{R}^d$ (as in Theorem
13.60) or on the full space $D = \mathbb{R}^d$ (as in Theorem 13.61) and denote their generators
by $A$. Prove that if $f = (u,v) \in D(A)$, then the solution of the wave equation
in the sense of semigroup theory, that is, the mapping $t \mapsto W(t)f$, belongs to
C?(R;L?(D)) NC(R;H?(D)). 
Hint: Let $\mathscr{H} = H^1_0(D) \times L^2(D)$ be the Hilbert space on which the wave group
acts. Start from the general observation that if $f \in D(A)$, then $t \mapsto W(t)f$ belongs
to $C^1(\mathbb{R};\mathscr{H}) \cap C(\mathbb{R};D(A))$; this follows from general semigroup considerations.
Then use the special structure of the wave operator $A$.

13.22 This problem gives some perspective on the bound $\|W(t)\| \le C(1+t)$ for the
wave group over the domain $\mathbb{R}^d$ (Theorem 13.61). Let $A$ be its generator.

(a) Is the operator $-iA$ selfadjoint? (Compare with Theorem 13.60.)
(b) Show that A —/ satisfies the conditions of the Lumer—Phillips theorem for all 
€ > Oif we endow H!(R®) with the equivalent norm 


lalla = fl +(Vuldx, we "CR. 


(Compare with (13.37).) Conclude that with respect to the resulting equivalent
norm $|||\cdot|||$ on $H^1(\mathbb{R}^d) \times L^2(\mathbb{R}^d)$ we have $|||W(t)||| \le e^{t}$ for all $t \ge 0$. How
does the norm $|||\cdot|||$ compare to the norm used in Theorem 13.61?

(c) Elaborating on the idea of part (b), show that for all $\varepsilon > 0$ the space $H^1(\mathbb{R}^d) \times L^2(\mathbb{R}^d)$
admits an equivalent norm $|||\cdot|||_\varepsilon$ such that $|||W(t)|||_\varepsilon \le e^{\varepsilon t}$ for all
$t \ge 0$.

13.23 Derive the formulas (13.21) and (13.35) for the heat semigroup and the free
Schrödinger group from their representations in terms of the projection-valued
measure associated with the Laplace operator (Example 13.64).


Trace Class Operators


This chapter is devoted to the study of trace class operators and the related class of 
Hilbert-Schmidt operators. In a sense that will be explained in the next chapter, we 
can think of positive trace class operators and the trace as noncommutative analogues 
of finite measures and the expectation. After proving some general properties of trace 
class operators, we compute traces in a number of interesting examples. 


14.1 Hilbert-Schmidt Operators 
Throughout this chapter we assume that H is a separable complex Hilbert space. 


Definition 14.1 (Hilbert–Schmidt operators). A bounded operator $T \in \mathscr{L}(H)$ is called
a Hilbert–Schmidt operator if

\[ \sum_{n\ge1} \|Th_n\|^2 < \infty \]

for some (equivalently, for every) orthonormal basis $(h_n)_{n\ge1}$ of $H$.


To see that this definition is independent of the orthonormal basis $(h_n)_{n\ge1}$, let $(h_n')_{n\ge1}$
be another orthonormal basis of $H$. If $T_1$ and $T_2$ are Hilbert–Schmidt, then

\[
\begin{aligned}
\sum_{n\ge1} (T_1h_n|T_2h_n) &= \sum_{n\ge1}\sum_{k\ge1} (T_1h_n|h_k')(h_k'|T_2h_n) \\
&= \sum_{k\ge1}\sum_{n\ge1} (T_2^*h_k'|h_n)(h_n|T_1^*h_k') = \sum_{k\ge1} (T_2^*h_k'|T_1^*h_k').
\end{aligned}
\tag{14.1}
\]

Using this identity with $h_n$ replaced by $h_n'$,

\[ \sum_{k\ge1} (T_2^*h_k'|T_1^*h_k') = \sum_{n\ge1} (T_1h_n'|T_2h_n'). \]

Taking $T_1 = T_2 = T$ we infer that for a Hilbert–Schmidt operator $T$, the quantity

\[ \|T\|_{\mathscr{L}_2(H)} := \Big( \sum_{n\ge1} \|Th_n\|^2 \Big)^{1/2} \]

is independent of the orthonormal basis $(h_n)_{n\ge1}$ of $H$. It is clear that

\[ \|T\| \le \|T\|_{\mathscr{L}_2(H)}. \]

For $g,h \in H$ we recall the notation $g \bar\otimes h$ for the operator on $H$ defined by

\[ (g \bar\otimes h)x := (x|h)g, \quad x \in H. \]


Example 14.2 (Finite rank operators). Every finite rank operator is a Hilbert–Schmidt
operator. Indeed, by a Gram–Schmidt argument we may represent $T$ as

\[ T = \sum_{j=1}^k g_j \bar\otimes h_j \]

with $g_1,\dots,g_k \in H$ orthonormal in $H$ and $h_1,\dots,h_k \in H$. Completing to an orthonormal
basis $(g_j)_{j\ge1}$, we have

\[ \sum_{n\ge1} \|Tg_n\|^2 = \sum_{n\ge1}\sum_{j=1}^k |(g_n|h_j)|^2 = \sum_{j=1}^k \|h_j\|^2. \]


n>1 n>1 j= 


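Example 14.3 (Integral operator with square integrable kernel). Let $(\Omega,\mu)$ be a $\sigma$-finite
measure space and let $k \in L^2(\Omega\times\Omega, \mu\times\mu)$ be given. Then

\[ Tf(s) := \int_\Omega k(s,t)f(t)\,d\mu(t), \quad s \in \Omega, \]

defines a Hilbert–Schmidt operator $T$ on $L^2(\Omega,\mu)$, for if $(h_n)_{n\ge1}$ is an orthonormal basis
of $L^2(\Omega,\mu)$, then

\[
\begin{aligned}
\|T\|^2_{\mathscr{L}_2(L^2(\Omega,\mu))} &= \sum_{n\ge1} \int_\Omega \Big| \int_\Omega k(s,t)h_n(t)\,d\mu(t) \Big|^2 d\mu(s) \\
&= \int_\Omega \sum_{n\ge1} \Big| \int_\Omega k(s,t)h_n(t)\,d\mu(t) \Big|^2 d\mu(s) \\
&= \int_\Omega\int_\Omega |k(s,t)|^2\,d\mu(t)\,d\mu(s) = \|k\|^2_{L^2(\Omega\times\Omega,\mu\times\mu)}.
\end{aligned}
\]

As a special case, any $(d\times d)$ matrix $A = (a_{jk})_{1\le j,k\le d}$ is a Hilbert–Schmidt operator as
a linear operator on $\mathbb{C}^d$ and

\[ \|A\|^2_{\mathscr{L}_2(\mathbb{C}^d)} = \sum_{1\le j,k\le d} |a_{jk}|^2. \]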
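The matrix case can be checked numerically. The following is a minimal sketch in plain Python, not part of the original text; the matrix `A`, the rotation angle `t`, and all helper names are arbitrary illustrative choices. It evaluates the sum of the squared norms of the images of the basis vectors for the standard basis and for a rotated orthonormal basis of $\mathbb{C}^2$ and compares both with the sum of the squared moduli of the entries.

```python
import math

def apply(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm_sq(x):
    """Squared Euclidean norm of a (complex) vector."""
    return sum(abs(xi) ** 2 for xi in x)

def hs_norm_sq(A, basis):
    """Sum of ||A h||^2 over the vectors h of an orthonormal basis."""
    return sum(norm_sq(apply(A, h)) for h in basis)

# Illustrative 2x2 complex matrix (arbitrary choice).
A = [[1 + 2j, -1], [0.5j, 3]]

standard = [[1, 0], [0, 1]]
t = 0.7  # arbitrary rotation angle
rotated = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

# Sum of squared moduli of the matrix entries.
entries = sum(abs(a) ** 2 for row in A for a in row)

print(hs_norm_sq(A, standard), hs_norm_sq(A, rotated), entries)
```

Up to rounding, the three printed numbers agree, in line with the basis independence established via (14.1).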


A converse to this example will be stated at the end of this section.

Proposition 14.4. The space $\mathscr{L}_2(H)$ of all Hilbert–Schmidt operators on $H$ is a Hilbert
space with respect to the inner product

\[ (T_1|T_2) := \sum_{n\ge1} (T_1h_n|T_2h_n), \]

where $(h_n)_{n\ge1}$ is any orthonormal basis of $H$.


Proof  It is elementary to check that $(T_1|T_2) := \sum_{n\ge1} (T_1h_n|T_2h_n)$ defines an inner product.
Its independence of the choice of the basis follows from (14.1).

The triangle inequality in $\ell^2$ implies that $\mathscr{L}_2(H)$ is a normed space. To prove completeness,
suppose that $(T_k)_{k\ge1}$ is a Cauchy sequence in $\mathscr{L}_2(H)$. Then $(T_k)_{k\ge1}$ is a
Cauchy sequence in $\mathscr{L}(H)$. Let $T \in \mathscr{L}(H)$ be its limit. If $(h_j)_{j\ge1}$ is an orthonormal
basis for $H$, then for all $n \ge 1$ we have

\[ \sum_{j=1}^n \|Th_j\|^2 = \lim_{k\to\infty} \sum_{j=1}^n \|T_kh_j\|^2 \le \limsup_{k\to\infty} \|T_k\|^2_{\mathscr{L}_2(H)} < \infty. \]

Upon letting $n \to \infty$, it follows that $T$ is a Hilbert–Schmidt operator and

\[ \|T\|_{\mathscr{L}_2(H)} \le \limsup_{k\to\infty} \|T_k\|_{\mathscr{L}_2(H)}. \]

Also,

\[ \sum_{j=1}^n \|(T_k - T)h_j\|^2 = \lim_{m\to\infty} \sum_{j=1}^n \|(T_k - T_m)h_j\|^2 \le \limsup_{m\to\infty} \|T_k - T_m\|^2_{\mathscr{L}_2(H)}. \]

It follows that $\|T_k - T\|_{\mathscr{L}_2(H)} \le \limsup_{m\to\infty} \|T_k - T_m\|_{\mathscr{L}_2(H)}$. Since the
latter tends to $0$ as $k \to \infty$ it follows that $\lim_{k\to\infty} T_k = T$ in $\mathscr{L}_2(H)$. This proves completeness.


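Proposition 14.5. Every Hilbert–Schmidt operator is compact and can be approximated,
in the Hilbert–Schmidt norm, by finite rank operators.


Proof  Let $T$ be a Hilbert–Schmidt operator on $H$, let $(h_n)_{n\ge1}$ be an orthonormal basis
for $H$, and denote by $P_N$ the orthogonal projection onto the span of $\{h_1,\dots,h_N\}$. Then
$P_NT$ is a finite rank operator and hence Hilbert–Schmidt, and we have

\[ \limsup_{N\to\infty} \|P_NT - T\|^2 \le \limsup_{N\to\infty} \|P_NT - T\|^2_{\mathscr{L}_2(H)} = \limsup_{N\to\infty} \sum_{n\ge N+1} \|T^*h_n\|^2 = 0. \]

Here the middle identity follows from (14.1) applied to $P_NT - T$, whose adjoint $T^*P_N - T^*$
vanishes on $h_1,\dots,h_N$ and equals $-T^*$ on the remaining basis vectors, and the final limit
is zero because $\sum_{n\ge1} \|T^*h_n\|^2 = \|T\|^2_{\mathscr{L}_2(H)} < \infty$, again by (14.1).
Each $P_NT$ is a finite rank operator, hence compact. Since uniform limits of compact
operators are compact, it follows that $T$ is compact.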


Proposition 14.6. A bounded operator $T \in \mathscr{L}(H)$ is a Hilbert–Schmidt operator if
and only if $T^*$ is a Hilbert–Schmidt operator, and in this case we have

\[ \|T\|_{\mathscr{L}_2(H)} = \|T^*\|_{\mathscr{L}_2(H)}. \]


Proof  This is immediate from (14.1).


Hilbert-Schmidt operators have the following ideal property: 


Proposition 14.7. If $T$ is a Hilbert–Schmidt operator and $S$ and $U$ are bounded, then
$STU$ is a Hilbert–Schmidt operator and

\[ \|STU\|_{\mathscr{L}_2(H)} \le \|S\|\,\|T\|_{\mathscr{L}_2(H)}\,\|U\|. \]

Proof  It is clear that $ST$ is a Hilbert–Schmidt operator and

\[ \|ST\|_{\mathscr{L}_2(H)} \le \|S\|\,\|T\|_{\mathscr{L}_2(H)}. \]

Applying this to $U^*$ and $T^*$ using Proposition 14.6, it follows that $U^*T^*$ is a Hilbert–Schmidt
operator and

\[ \|U^*T^*\|_{\mathscr{L}_2(H)} \le \|U^*\|\,\|T^*\|_{\mathscr{L}_2(H)} = \|U\|\,\|T\|_{\mathscr{L}_2(H)}. \]

Then $TU = (U^*T^*)^*$ is a Hilbert–Schmidt operator and

\[ \|TU\|_{\mathscr{L}_2(H)} = \|U^*T^*\|_{\mathscr{L}_2(H)} \le \|U\|\,\|T\|_{\mathscr{L}_2(H)}. \]

Using the first step once more, this implies that $STU$ is a Hilbert–Schmidt operator and
satisfies the estimate in the statement of the proposition.


Theorem 14.8. Let $(\Omega,\mathscr{F},\mu)$ be a separable measure space, that is, the Hilbert space
$L^2(\Omega,\mu)$ is separable. If $T \in \mathscr{L}_2(L^2(\Omega,\mu))$, there exists a unique $k \in L^2(\Omega\times\Omega,\mu\times\mu)$ such
that for all $f \in L^2(\Omega,\mu)$ and $\mu$-almost all $\omega \in \Omega$ we have

\[ Tf(\omega) = \int_\Omega k(\omega,\omega')f(\omega')\,d\mu(\omega'). \]

The proof of this theorem will be given in the next section.


14.2 The Singular Value Decomposition


Recall that a bounded operator $T \in \mathscr{L}(H)$ is called positive if $(Th|h) \ge 0$ for all $h \in H$.
Every positive operator is selfadjoint.


Definition 14.9 (Trace, of a positive operator). The trace of a positive operator $T \in \mathscr{L}(H)$
is the nonnegative extended-real number defined by

\[ \operatorname{tr}(T) := \sum_{n\ge1} (Th_n|h_n), \]

where $(h_n)_{n\ge1}$ is any orthonormal basis of $H$.

To see that $\operatorname{tr}(T)$ is well defined, suppose that $(h_n)_{n\ge1}$ and $(h_n')_{n\ge1}$ are orthonormal
bases of $H$. Then, using the selfadjointness of the positive square root $T^{1/2}$ and Parseval's
identity, for any $N \ge 1$ we have

\[
\begin{aligned}
\sum_{n=1}^N (Th_n'|h_n') &= \sum_{n=1}^N \|T^{1/2}h_n'\|^2 = \sum_{n=1}^N \sum_{j\ge1} |(T^{1/2}h_n'|h_j)|^2 \\
&= \sum_{j\ge1} \sum_{n=1}^N |(h_n'|T^{1/2}h_j)|^2 \le \sum_{j\ge1} \|T^{1/2}h_j\|^2 = \sum_{j\ge1} (Th_j|h_j).
\end{aligned}
\]

This gives the bound $\sum_{n\ge1} (Th_n'|h_n') \le \sum_{j\ge1} (Th_j|h_j)$. The converse bound is obtained
by interchanging the roles of the two bases.


Definition 14.10 (Trace class operators). A bounded operator $T \in \mathscr{L}(H)$ is called a
trace class operator if its modulus $|T| := (T^*T)^{1/2}$ has finite trace.


Proposition 14.11. If $T \in \mathscr{L}(H)$ is a trace class operator, then $\|T\| \le \operatorname{tr}(|T|)$.

Proof  This follows from

\[ \|T\|^2 = \|T^*T\| = \sup_{\|h\|=1} (T^*Th|h) = \sup_{\|h\|=1} \||T|h\|^2 = \||T|\|^2, \qquad \||T|\| = \sup_{\|h\|\le1} (|T|h|h) \le \operatorname{tr}(|T|), \]

using the identities of Proposition 4.28 and Theorem 8.11.


Example 14.12 (Finite rank operators). Every finite rank operator is a trace class operator.
This follows quite easily once we have proved, in Theorem 14.22, that the set of
trace class operators is a linear subspace of $\mathscr{L}(H)$. Indeed, taking this result for granted,
we only need to observe that each rank one operator of the form $T = g \bar\otimes h$ is a trace
class operator. To this end we normalise $\|g\| = \|h\| = 1$. From

\[ (x|T^*y) = ((g\bar\otimes h)x|y) = (x|h)(g|y) = (x|\overline{(g|y)}h) = (x|(y|g)h) = (x|(h\bar\otimes g)y) \]

we deduce that $T^* = h \bar\otimes g$. This implies

\[ T^*Tx = (h\bar\otimes g)(g\bar\otimes h)x = (h\bar\otimes g)(x|h)g = (x|h)(g|g)h = (x|h)h = (h\bar\otimes h)x, \]

proving that $T^*T = h \bar\otimes h$. This is an orthogonal projection, so it equals its own square
root; hence $|T| = h \bar\otimes h$ and $\operatorname{tr}(|T|) = 1$. Undoing the normalisation, this gives

\[ \operatorname{tr}(|T|) = \|g\|\,\|h\|. \tag{14.2} \]
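The identity (14.2) can be tested on a concrete rank one matrix. The sketch below is an illustration in plain Python and not part of the original text; the vectors `g` and `h` and the helper names are arbitrary choices. It forms the candidate $(\|g\|/\|h\|)\, h \bar\otimes h$ for $|T|$, checks that its square equals $T^*T$ (so that, being positive, it is the modulus of $T$), and compares its trace with $\|g\|\|h\|$.

```python
import math

def outer(g, h):
    """Matrix of x -> (x|h) g, with entries g_i * conj(h_j)."""
    return [[gi * hj.conjugate() for hj in h] for gi in g]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def norm(x):
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))

# Arbitrary illustrative vectors in C^3 (not normalised).
g = [1 + 1j, 2 + 0j, 0j]
h = [0j, 3 + 0j, 4j]

T = outer(g, h)
c = norm(g) / norm(h)
modT = [[c * e for e in row] for row in outer(h, h)]  # candidate for |T|

lhs = matmul(modT, modT)      # candidate squared
rhs = matmul(adjoint(T), T)   # T*T
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
trace_modT = sum(modT[i][i] for i in range(3)).real

print(max_diff, trace_modT, norm(g) * norm(h))
```

The first printed number is zero up to rounding, and the last two agree.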


It is instructive to present a direct proof that finite rank operators are trace class which
avoids the use of Theorem 14.22, if only because it provides an explicit expression for
the trace of $|T|$.

Let us write $H = V \oplus V^\perp$ with $V := (N(T))^\perp$ finite-dimensional, and observe that
$T^*T$ maps $V$ into $V$ and vanishes on $V^\perp$. Indeed, for $v \in V$ and $v^\perp \in V^\perp = N(T)$ we
have $(T^*Tv|v^\perp) = (Tv|Tv^\perp) = 0$ since $Tv^\perp = 0$, so the range of $T^*T$ is orthogonal to
$V^\perp$ and hence contained in $V$; the second part of the claim is trivial since $V^\perp = N(T)$.

The subspace $W := R(T)$ is finite-dimensional. Denoting by $T_{VW}$ the restriction of
$T$ to $V$, viewed as a mapping into $W$, we have $T^*T|_V = T_{VW}^*T_{VW}$. Indeed, this follows
from the fact that for all $v,v' \in V$ we have

\[ (T^*Tv|v')_V = (T^*Tv|v') = (Tv|Tv') = (T_{VW}v|T_{VW}v')_W = (T_{VW}^*T_{VW}v|v')_V \]

and the claim follows. Choose an orthonormal basis $(h_j)_{j=1}^k$ for $V$ and complete it to an
orthonormal basis $(h_j)_{j\ge1}$ for $H$ by adding the vectors in an orthonormal basis for $V^\perp$.
If $j \ge k+1$, then $h_j \in V^\perp = N(T)$, so

\[ \||T|h_j\|^2 = ((T^*T)^{1/2}h_j|(T^*T)^{1/2}h_j) = (T^*Th_j|h_j) = 0. \]

Therefore,

\[ \sum_{j\ge1} (|T|h_j|h_j) = \sum_{j=1}^k (|T|h_j|h_j) = \sum_{j=1}^k ((T_{VW}^*T_{VW})^{1/2}h_j|h_j) = \operatorname{tr}\big((T_{VW}^*T_{VW})^{1/2}\big) < \infty. \]
Proposition 14.13. Every trace class operator is compact.


Proof  First assume that $T$ is a positive trace class operator. Let $(h_n)_{n\ge1}$ be an orthonormal
basis of $H$. Then from

\[ \sum_{n\ge1} \|T^{1/2}h_n\|^2 = \sum_{n\ge1} (Th_n|h_n) = \operatorname{tr}(T) < \infty \]

we see that $T^{1/2}$ is a Hilbert–Schmidt operator and therefore compact. Hence also $T = (T^{1/2})^2$
is compact.
In the general case let $T = U|T|$ be the polar decomposition of $T$, with $U$ an isometry
from $\overline{R(|T|)}$ onto $\overline{R(T)}$ (see Theorem 8.30). Since the positive operator $|T|$ is a trace
class operator, $|T|$ is compact, hence so is $T$.


Let $T$ be a compact operator, with polar decomposition $T = U|T|$. Viewing $U$ as an
isometry from $\overline{R(|T|)}$ onto $\overline{R(T)}$, its adjoint $U^*$ is an isometry from $\overline{R(T)}$ onto $\overline{R(|T|)}$
satisfying $U^*U = I$, and consequently $|T| = U^*T$. It follows that $|T|$ is compact.


Definition 14.14 (Singular values). The singular values of a compact operator $T$ on $H$
are the nonzero eigenvalues of the compact operator $|T|$.


Since $|T|$ is positive, every singular value is a strictly positive real number, and since
$|T|$ is compact, the set of singular values is finite or countable with $0$ as its only possible
accumulation point. We may therefore think of the set of singular values as a nonincreasing
(finite or infinite) sequence $(\lambda_n)_{n\ge1}$. This sequence, where each $\lambda_n$ is repeated
according to its multiplicity, is called the singular value sequence.

According to the singular value decomposition for compact operators (Theorem 9.2),
every compact operator $T \in \mathscr{L}(H)$ admits a decomposition

\[ T = \sum_{n\ge1} \lambda_n\, g_n \bar\otimes h_n \]

with convergence in the operator norm, where $(\lambda_n)_{n\ge1}$ is the singular value sequence
of $T$ and $(g_n)_{n\ge1}$ and $(h_n)_{n\ge1}$ are orthonormal bases in $H$. The following theorem characterises
trace class and Hilbert–Schmidt operators in terms of the sequence $(\lambda_n)_{n\ge1}$.
In order to state the two cases symmetrically, we use the notation $\|T\|_{\mathscr{L}_1(H)} := \operatorname{tr}(|T|)$.
In the next section we prove that the set $\mathscr{L}_1(H)$ of all trace class operators on $H$ is a
Banach space.
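For a concrete feel for singular value sequences, the following sketch (plain Python, added for illustration and not part of the original text; the matrix `A` and the function name are arbitrary choices) computes the two singular values of a real $2\times2$ matrix as the square roots of the eigenvalues of $A^{\mathsf T}A$, and from them the trace norm $\operatorname{tr}(|A|)$ and the squared Hilbert–Schmidt norm. The latter is compared with the sum of the squared entries from Example 14.3.

```python
import math

def singular_values_2x2(A):
    """Singular values of a real 2x2 matrix: square roots of the
    eigenvalues of the symmetric matrix A^T A."""
    (a, b), (c, d) = A
    # Entries of A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    tr, det = p + r, p * r - q * q
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return math.sqrt((tr + disc) / 2), math.sqrt(max((tr - disc) / 2, 0.0))

A = [[3.0, 0.0], [4.0, 5.0]]   # arbitrary illustrative matrix
s1, s2 = singular_values_2x2(A)

trace_norm = s1 + s2                 # sum of the singular values
hs_norm_sq = s1 ** 2 + s2 ** 2       # sum of their squares
frobenius_sq = sum(x * x for row in A for x in row)

print(s1, s2, trace_norm, hs_norm_sq, frobenius_sq)
```

For this matrix the singular values are $3\sqrt5$ and $\sqrt5$, so the trace norm is $4\sqrt5$ and the squared Hilbert–Schmidt norm is $50$.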


Theorem 14.15 (Singular value decomposition). Let $T \in \mathscr{L}(H)$ be compact, and let
$(\lambda_n)_{n\ge1}$ be its singular value sequence. Then:


(1) $T$ is a trace class operator if and only if $\sum_{n\ge1} \lambda_n < \infty$. In this case we have

\[ \|T\|_{\mathscr{L}_1(H)} = \sum_{n\ge1} \lambda_n. \]

(2) $T$ is a Hilbert–Schmidt operator if and only if $\sum_{n\ge1} \lambda_n^2 < \infty$. In this case we have

\[ \|T\|^2_{\mathscr{L}_2(H)} = \sum_{n\ge1} \lambda_n^2. \]

In either case we have

\[ T = \sum_{n\ge1} \lambda_n\, g_n \bar\otimes h_n, \]

where $(g_n)_{n\ge1}$ and $(h_n)_{n\ge1}$ are orthonormal bases in $H$, with convergence in the norm
of $\mathscr{L}_1(H)$ in case (1) and convergence in the norm of $\mathscr{L}_2(H)$ in case (2). If $T$ is positive
we may take $(g_n)_{n\ge1} = (h_n)_{n\ge1}$.
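Proof  (1): Let $\mu_1 > \mu_2 > \dots$ be the sequence of distinct nonzero eigenvalues of $|T|$.
Since $|T|$ is selfadjoint, the eigenspaces $Y_k$ corresponding to $\mu_k$ are pairwise orthogonal,
and since $|T|$ is compact they are finite-dimensional, say $\dim(Y_k) =: d_k$. By the spectral
theorem for compact selfadjoint operators (Theorem 9.1) we have

\[ |T| = \sum_{k\ge1} \mu_kP_k \]

with convergence in the operator norm of $\mathscr{L}(H)$, where $P_k$ is the orthogonal projection onto $Y_k$.
Choosing orthonormal bases $(h_j^{(k)})_{j=1}^{d_k}$ for $Y_k$ we may write $P_k = \sum_{j=1}^{d_k} h_j^{(k)} \bar\otimes h_j^{(k)}$ and

\[ |T| = \sum_{k\ge1} \mu_k \sum_{j=1}^{d_k} h_j^{(k)} \bar\otimes h_j^{(k)}, \]

again with convergence in the operator norm of $\mathscr{L}(H)$. Fixing an orthonormal basis
$(h_n')_{n\ge1}$ of $H$, it follows that

\[
\begin{aligned}
\operatorname{tr}(|T|) &= \sum_{n\ge1} (|T|h_n'|h_n') = \sum_{n\ge1}\sum_{k\ge1} \mu_k \sum_{j=1}^{d_k} |(h_n'|h_j^{(k)})|^2 \\
&= \sum_{k\ge1} \mu_k \sum_{j=1}^{d_k} \sum_{n\ge1} |(h_n'|h_j^{(k)})|^2 = \sum_{k\ge1} d_k\mu_k = \sum_{n\ge1} \lambda_n.
\end{aligned}
\]

This gives the first assertion. To prove convergence in the norm of $\mathscr{L}_1(H)$ we note that

\[
\Big\| T - \sum_{n=1}^N \lambda_n g_n \bar\otimes h_n \Big\|_{\mathscr{L}_1(H)}
= \Big\| \sum_{n\ge N+1} \lambda_n g_n \bar\otimes h_n \Big\|_{\mathscr{L}_1(H)}
\le \sum_{n\ge N+1} \lambda_n \|g_n \bar\otimes h_n\|_{\mathscr{L}_1(H)} = \sum_{n\ge N+1} \lambda_n \to 0 \quad \text{as } N \to \infty,
\]

making use of the fact that $\|g_n \bar\otimes h_n\|_{\mathscr{L}_1(H)} = 1$ by (14.2).


(2): With the notation of (1), the doubly indexed sequence $(h_j^{(k)})_{k\ge1,\,1\le j\le d_k}$ is an
orthonormal basis for $\bigoplus_{k\ge1} Y_k$ and we have

\[ \sum_{k\ge1}\sum_{j=1}^{d_k} \|Th_j^{(k)}\|^2 = \sum_{k\ge1}\sum_{j=1}^{d_k} \||T|h_j^{(k)}\|^2 = \sum_{k\ge1}\sum_{j=1}^{d_k} \mu_k^2 = \sum_{n\ge1} \lambda_n^2. \]

Since $Th = 0$ for $h \in Y_0 := N(T)$, this gives the first assertion. Convergence in the norm
of $\mathscr{L}_2(H)$ is proved by testing against the orthonormal basis $(g_m)_{m\ge1}$:

\[
\Big\| T - \sum_{n=1}^N \lambda_n g_n \bar\otimes h_n \Big\|^2_{\mathscr{L}_2(H)}
= \Big\| \sum_{n\ge N+1} \lambda_n g_n \bar\otimes h_n \Big\|^2_{\mathscr{L}_2(H)}
= \sum_{m\ge1}\sum_{n\ge N+1} |\lambda_n|^2 |(g_m|h_n)|^2 \le \sum_{n\ge N+1} |\lambda_n|^2 \to 0 \quad \text{as } N \to \infty.
\]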
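The two criteria of Theorem 14.15 are easy to compare on diagonal operators. The following worked example is a standard illustration added here and is not taken from the text above; the orthonormal basis $(e_n)_{n\ge1}$ and the symbols $T_\alpha$, $\alpha$ are notation assumed only for this example.

```latex
% Diagonal operators with respect to a fixed orthonormal basis (e_n) of H:
%   T_alpha e_n := n^{-alpha} e_n,  alpha > 0.
% T_alpha is positive and compact, with singular value sequence lambda_n = n^{-alpha}.
\[
  T_\alpha := \sum_{n\ge 1} n^{-\alpha}\, e_n \bar\otimes e_n ,
  \qquad
  \|T_\alpha\|_{\mathscr{L}_1(H)} = \sum_{n\ge 1} n^{-\alpha},
  \qquad
  \|T_\alpha\|_{\mathscr{L}_2(H)}^2 = \sum_{n\ge 1} n^{-2\alpha}.
\]
% Hence T_alpha is Hilbert-Schmidt iff alpha > 1/2 and trace class iff alpha > 1.
% The borderline case alpha = 1 is Hilbert-Schmidt but not trace class:
\[
  \|T_1\|_{\mathscr{L}_2(H)}^2 = \sum_{n\ge 1} \frac{1}{n^2} = \frac{\pi^2}{6},
  \qquad
  \operatorname{tr}(|T_1|) = \sum_{n\ge 1} \frac{1}{n} = \infty .
\]
```

In particular, the class of trace class operators is strictly smaller than the class of Hilbert–Schmidt operators whenever $H$ is infinite-dimensional.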


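At this point we briefly pause to insert a proof of Theorem 14.8.


Proof of Theorem 14.8  Let $T \in \mathscr{L}_2(L^2(\Omega,\mu))$ be given, with $L^2(\Omega,\mu)$ separable. By
Theorem 14.15 we have

\[ T = \sum_{n\ge1} \lambda_n\, g_n \bar\otimes h_n, \]

where $(g_n)_{n\ge1}$ and $(h_n)_{n\ge1}$ are orthonormal bases of $L^2(\Omega,\mu)$ and the nonnegative real
numbers $\lambda_n \ge 0$ satisfy $\sum_{n\ge1} \lambda_n^2 < \infty$. Then, for $f \in L^2(\Omega,\mu)$ and $\mu$-almost all $\omega \in \Omega$,

\[ Tf(\omega) = \sum_{n\ge1} \lambda_n (f|h_n) g_n(\omega) = \int_\Omega k(\omega,\omega')f(\omega')\,d\mu(\omega'), \]

where

\[ k(\omega,\omega') := \sum_{n\ge1} \lambda_n\, g_n(\omega)\overline{h_n(\omega')} \]

is square integrable since

\[
\begin{aligned}
\int_\Omega\int_\Omega |k(\omega,\omega')|^2\,d\mu(\omega)\,d\mu(\omega')
&= \int_\Omega\int_\Omega \sum_{m\ge1}\sum_{n\ge1} \lambda_n\lambda_m\, g_n(\omega)\overline{g_m(\omega)}\,\overline{h_n(\omega')}h_m(\omega')\,d\mu(\omega)\,d\mu(\omega') \\
&= \sum_{m\ge1}\sum_{n\ge1} \lambda_n\lambda_m (g_n|g_m)(h_m|h_n) = \sum_{n\ge1} \lambda_n^2 < \infty.
\end{aligned}
\]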

Returning to the main line of development, Theorem 14.15 permits us to extend the 
trace to arbitrary trace class operators. 


Theorem 14.16 (Trace). If $T \in \mathscr{L}(H)$ is a trace class operator, for any orthonormal
basis $(h_n)_{n\ge1}$ of $H$ the sum

\[ \operatorname{tr}(T) := \sum_{n\ge1} (Th_n|h_n) \]

converges absolutely, the sum is independent of the choice of the basis, and

\[ |\operatorname{tr}(T)| \le \operatorname{tr}(|T|). \]

Proof  Let $T = U|T|$ be the polar decomposition of $T$, with $U$ a partial isometry from
$\overline{R(|T|)}$ onto $\overline{R(T)}$. First consider the special case where $(h_n)_{n\ge1}$ is the orthonormal
sequence of eigenvectors of $|T|$ obtained by combining the sequences $(h_j^{(k)})_{j=1}^{d_k}$, $k \ge 1$,
from the proof of Theorem 14.15 into one sequence, so that $|T|h = \sum_{n\ge1} \lambda_n (h|h_n)h_n$.
Then,

\[ \sum_{n\ge1} |(Th_n|h_n)| = \sum_{n\ge1} |(U|T|h_n|h_n)| = \sum_{n\ge1} \lambda_n |(Uh_n|h_n)| \le \sum_{n\ge1} \lambda_n = \operatorname{tr}(|T|). \]

It follows that $\operatorname{tr}(T) = \sum_{n\ge1} (Th_n|h_n)$ with absolute convergence and $|\operatorname{tr}(T)| \le \operatorname{tr}(|T|)$.
Now let $(h_n')_{n\ge1}$ be an arbitrary orthonormal basis. Then, with the notation as above,

\[
\begin{aligned}
\sum_{n\ge1} |(Th_n'|h_n')| &= \sum_{n\ge1} \Big| \sum_{k\ge1} (Th_k|h_n')(h_n'|h_k) \Big| \le \sum_{n\ge1}\sum_{k\ge1} |(Th_k|h_n')(h_n'|h_k)| \\
&= \sum_{k\ge1}\sum_{n\ge1} |(U|T|h_k|h_n')(h_n'|h_k)| = \sum_{k\ge1} \lambda_k \sum_{n\ge1} |(Uh_k|h_n')(h_n'|h_k)| \\
&\le \sum_{k\ge1} \lambda_k \Big( \sum_{n\ge1} |(Uh_k|h_n')|^2 \Big)^{1/2} \Big( \sum_{n\ge1} |(h_n'|h_k)|^2 \Big)^{1/2} \\
&= \sum_{k\ge1} \lambda_k \|Uh_k\|\,\|h_k\| \le \sum_{k\ge1} \lambda_k.
\end{aligned}
\]

This shows that $\sum_{n\ge1} (Th_n'|h_n')$ is absolutely summable. Moreover,

\[ \sum_{n\ge1} (Th_n'|h_n') = \sum_{n\ge1}\sum_{k\ge1} (Th_k|h_n')(h_n'|h_k) = \sum_{k\ge1}\sum_{n\ge1} (Th_k|h_n')(h_n'|h_k) = \sum_{k\ge1} (Th_k|h_k), \]

where the change of summation order is justified by the previous estimates, which imply
the absolute summability of the double summations.


Definition 14.17 (Trace, of a trace class operator). The trace of a trace class operator
$T \in \mathscr{L}(H)$ is defined by

\[ \operatorname{tr}(T) := \sum_{n\ge1} (Th_n|h_n), \]

where $(h_n)_{n\ge1}$ is any orthonormal basis of $H$.
By Theorem 14.16, the trace is well defined.
Example 14.18 (Finite rank operators, continued). The trace of a finite rank operator
$T = \sum_{n=1}^N g_n \bar\otimes h_n$ is given by

\[ \operatorname{tr}(T) = \sum_{n=1}^N \operatorname{tr}(g_n \bar\otimes h_n) = \sum_{n=1}^N (g_n|h_n). \]

Here we use the fact that the trace of a rank one operator $g \bar\otimes h$ may be evaluated in
terms of an orthonormal basis $(h_n')_{n\ge1}$ chosen such that $h_1' = h/\|h\|$ to give

\[ \operatorname{tr}(g \bar\otimes h) = \sum_{n\ge1} (g|h_n')(h_n'|h) = (g|h). \]

If $P$ is a (not necessarily orthogonal) projection onto an $N$-dimensional subspace,
then

\[ \operatorname{tr}(P) = N. \]
To see this we write $P = \sum_{n=1}^N g_n \bar\otimes h_n$ with $g_1,\dots,g_N$ orthonormal. Since $g_1,\dots,g_N$
span the range of $P$, we have $Pg_n = g_n$, that is, $\sum_{m=1}^N (g_n|h_m)g_m = g_n$. By the
orthonormality of $g_1,\dots,g_N$ we deduce that $(g_n|h_n) = 1$ and the result follows from the
first part of the example.

More interesting examples will be given in Section 14.5.
We prove next that the set $\mathscr{L}_1(H)$ of all trace class operators on $H$ is a vector space
and, endowed with the norm

\[ \|T\|_{\mathscr{L}_1(H)} := \operatorname{tr}(|T|), \]

a Banach space. We begin with the proof that $\mathscr{L}_1(H)$ is a vector space. It is evident
that if $T$ is a trace class operator, then so is $cT$ for all $c \in \mathbb{C}$ and $\operatorname{tr}(|cT|) = |c|\operatorname{tr}(|T|)$.
Additivity is less trivial and is based on a characterisation of trace class operators which
we prove first. The crucial ingredient is the following lemma.


Lemma 14.19. Let $T \in \mathscr{L}(H)$ be compact and let $(\lambda_n)_{n\ge1}$ be its singular value sequence.
Then we have

\[ \sum_{n\ge1} \lambda_n = \sup_{g,h} \Big| \sum_{j=1}^k (Tg_j|h_j) \Big| = \sup_{g,h} \sum_{j=1}^k |(Tg_j|h_j)|, \]

where the suprema are taken over all finite orthonormal sequences $g = (g_j)_{j=1}^k$ and
$h = (h_j)_{j=1}^k$ in $H$.
Here we allow the possibility that all three expressions are infinite.

Proof  Without loss of generality we assume that $T \neq 0$. Let $e_j$ be a normalised eigenvector
for $|T|$ with strictly positive eigenvalue $\lambda_j$. Consider a polar decomposition
$T = U|T|$, with $U$ a partial isometry which is isometric from $\overline{R(|T|)}$ onto $\overline{R(T)}$. Then,

\[ \sum_{j=1}^n \lambda_j = \sum_{j=1}^n (|T|e_j|e_j) = \sum_{j=1}^n (U^*Te_j|e_j) = \Big| \sum_{j=1}^n (U^*Te_j|e_j) \Big| = \Big| \sum_{j=1}^n (Te_j|Ue_j) \Big|, \]

where in the second identity we used that $(x|y) = (Ux|Uy)$ for all $x,y \in \overline{R(|T|)}$ and in the
third that all $\lambda_j$ are positive. For the same reason, $(Ue_j)_{j=1}^n$ is an orthonormal sequence.
This gives the two inequalities '$\le$'.

In the converse direction, let two orthonormal sequences $(g_j)_{j=1}^k$ and $(h_j)_{j=1}^k$ in $H$
be given. Then, with the above notation, repetition of the second part of the proof of
Theorem 14.16 gives

\[
\begin{aligned}
\sum_{j=1}^k |(Tg_j|h_j)| &= \sum_{j=1}^k \Big| \sum_{i\ge1} \lambda_i (g_j|e_i)(Ue_i|h_j) \Big| \le \sum_{i\ge1} \lambda_i \sum_{j=1}^k |(g_j|e_i)(Ue_i|h_j)| \\
&\le \sum_{i\ge1} \lambda_i \Big( \sum_{j=1}^k |(g_j|e_i)|^2 \Big)^{1/2} \Big( \sum_{j=1}^k |(Ue_i|h_j)|^2 \Big)^{1/2} \\
&\le \sum_{i\ge1} \lambda_i \|Ue_i\|\,\|e_i\| \le \sum_{i\ge1} \lambda_i.
\end{aligned}
\]

This concludes the proof of the equalities.


Theorem 14.20 (Trace class operators). For a bounded operator $T \in \mathscr{L}(H)$ the following
assertions are equivalent:


(1) $T$ is a trace class operator;
(2) we have

\[ \sup_{g,h} \Big| \sum_{j\ge1} (Tg_j|h_j) \Big| < \infty, \]

the supremum being taken over all orthonormal sequences $g = (g_j)_{j\ge1}$ and $h = (h_j)_{j\ge1}$ of $H$;
(3) we have

\[ \sup_{g,h} \sum_{j\ge1} |(Tg_j|h_j)| < \infty, \]

the supremum being taken over all orthonormal sequences $g = (g_j)_{j\ge1}$ and $h = (h_j)_{j\ge1}$ of $H$.


In this situation, the suprema in (2) and (3) are in fact maxima, and we have

\[ \|T\|_{\mathscr{L}_1(H)} = \sup_{g,h} \Big| \sum_{j\ge1} (Tg_j|h_j) \Big| = \sup_{g,h} \sum_{j\ge1} |(Tg_j|h_j)|. \]


Proof  The equivalences follow from Lemma 14.19, which also gives the equalities in
the final assertion of the theorem. To see that the suprema are in fact maxima, consider
the sequences given by the singular value decomposition of Theorem 14.15.


The trace class condition is stated in terms of summability of its singular value se- 
quence. As a first application of Theorem 14.20 we show that if an operator is a trace 
class operator, its eigenvalue sequence is summable as well: 


Proposition 14.21. For any trace class operator $T \in \mathscr{L}(H)$, with eigenvalue sequence
$(\lambda_n)_{n\ge1}$ repeated according to algebraic multiplicity, we have

\[ \sum_{n\ge1} |\lambda_n| \le \|T\|_{\mathscr{L}_1(H)}. \]

Proof  By the multiplicativity of the holomorphic calculus, the spectral subspaces $X_\mu = R(P_\mu)$
corresponding to nonzero points $\mu \in \sigma(T)$, defined as in Theorem 6.25, are
pairwise orthogonal. Moreover, by Corollary 7.13, each $X_\mu$ is finite-dimensional. Since
$X_\mu$ is invariant under $T$, the restriction of $T$ to $X_\mu$ can be brought into Jordan normal
form by selecting a suitable orthonormal basis $(x_j^\mu)_{j=1}^{\nu_\mu}$ for $X_\mu$. Here, $\nu_\mu = \dim(X_\mu)$ is
the algebraic multiplicity of $\mu$. It follows that

\[ Tx_j^\mu = c_{j1}^\mu x_1^\mu + \dots + c_{jj}^\mu x_j^\mu \quad \text{with } c_{jj}^\mu = \mu, \quad j = 1,\dots,\nu_\mu. \]

We then have $(Tx_j^\mu|x_j^\mu) = \mu$. Hence, if $(\mu_i)_{i\ge1}$ denotes the (finite or infinite) sequence
of distinct nonzero eigenvalues of $T$, we have

\[ \sum_{n\ge1} |\lambda_n| = \sum_{i\ge1}\sum_{j=1}^{\nu_{\mu_i}} |\mu_i| = \sum_{i\ge1}\sum_{j=1}^{\nu_{\mu_i}} |(Tx_j^{\mu_i}|x_j^{\mu_i})| \le \|T\|_{\mathscr{L}_1(H)}, \]

the last inequality being a consequence of Theorem 14.20.
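The inequality of Proposition 14.21 can be strict. The sketch below (plain Python, an illustration that is not part of the original text; the matrix `T` and the helper name are arbitrary choices) compares the two sides for a $2\times2$ Jordan block. It uses the elementary facts that, for a real $2\times2$ matrix with singular values $s_1, s_2$, one has $s_1^2 + s_2^2 = \sum_{j,k} |a_{jk}|^2$ and $s_1s_2 = |\det A|$, so that $(s_1+s_2)^2 = \sum_{j,k}|a_{jk}|^2 + 2|\det A|$.

```python
import math

def trace_norm_2x2(A):
    """Sum of the singular values of a real 2x2 matrix, computed from
    s1^2 + s2^2 = sum of squared entries and s1 * s2 = |det A|."""
    (a, b), (c, d) = A
    frob_sq = a * a + b * b + c * c + d * d
    return math.sqrt(frob_sq + 2 * abs(a * d - b * c))

# Upper triangular Jordan block: the eigenvalues are the diagonal entries 1, 1.
T = [[1.0, 1.0], [0.0, 1.0]]

eigenvalue_sum = abs(T[0][0]) + abs(T[1][1])
print(eigenvalue_sum, trace_norm_2x2(T))
```

Here the eigenvalue sum is $2$, while the trace norm is $\sqrt5$.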


We continue with some results about the structure of the set of trace class operators. 


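Theorem 14.22 (Sums). If $T_1$ and $T_2$ are trace class operators on $H$, then so is their
sum $T_1 + T_2$, and we have $\operatorname{tr}(T_1 + T_2) = \operatorname{tr}(T_1) + \operatorname{tr}(T_2)$ and

\[ \operatorname{tr}(|T_1 + T_2|) \le \operatorname{tr}(|T_1|) + \operatorname{tr}(|T_2|). \]


Proof  Let $(\lambda_n)_{n\ge1}$, $(\mu_n)_{n\ge1}$, and $(\nu_n)_{n\ge1}$ denote the singular value sequences of $T_1$, $T_2$,
and $T_1 + T_2$, respectively. Applying Theorem 14.20 first to the compact operator $T_1 + T_2$
and then to $T_1$ and $T_2$ separately, we obtain

\[
\sum_{n\ge1} \nu_n = \sup_{g,h} \Big| \sum_{n\ge1} ((T_1 + T_2)g_n|h_n) \Big|
\le \sup_{g,h} \Big| \sum_{n\ge1} (T_1g_n|h_n) \Big| + \sup_{g,h} \Big| \sum_{n\ge1} (T_2g_n|h_n) \Big| = \sum_{n\ge1} \lambda_n + \sum_{n\ge1} \mu_n,
\]

where the suprema are taken over all orthonormal sequences $g = (g_n)_{n\ge1}$ and $h = (h_n)_{n\ge1}$ in $H$.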


Theorem 14.23 (Completeness). The normed space $\mathscr{L}_1(H)$ is a Banach space, and the
finite rank operators are dense in this space.


Proof  Suppose $(T_n)_{n\ge1}$ is a Cauchy sequence in $\mathscr{L}_1(H)$. Then $(T_n)_{n\ge1}$ is a Cauchy
sequence in $\mathscr{L}(H)$. Let $T \in \mathscr{L}(H)$ be its limit. Since each $T_n$ is compact, so is $T$.
Moreover, for all orthonormal sequences $(g_j)_{j\ge1}$ and $(h_j)_{j\ge1}$ in $H$ and all $k \ge 1$,

\[ \sum_{j=1}^k |(Tg_j|h_j)| = \lim_{n\to\infty} \sum_{j=1}^k |(T_ng_j|h_j)| \le \sup_{n\ge1} \|T_n\|_{\mathscr{L}_1(H)} < \infty. \]

Therefore Theorem 14.20 implies that $T$ is a trace class operator. Also,

\[ \sum_{j=1}^k |((T_n - T)g_j|h_j)| = \lim_{m\to\infty} \sum_{j=1}^k |((T_n - T_m)g_j|h_j)| \le \limsup_{m\to\infty} \|T_n - T_m\|_{\mathscr{L}_1(H)}. \]

Again by Theorem 14.20, this implies that

\[ \|T_n - T\|_{\mathscr{L}_1(H)} \le \limsup_{m\to\infty} \|T_n - T_m\|_{\mathscr{L}_1(H)}. \]




Since the latter tends to $0$ as $n \to \infty$, it follows that $\lim_{n\to\infty} T_n = T$ in $\mathscr{L}_1(H)$. This proves
that $\mathscr{L}_1(H)$ is complete.

To prove that the finite rank operators are dense it suffices to show that if $T$ is a
trace class operator, then the convergence in part (2) of Theorem 14.20 also takes place
with respect to the norm of $\mathscr{L}_1(H)$. The partial sums in (2) are Cauchy in the norm of
$\mathscr{L}_1(H)$ thanks to the absolute summability of the sequence $(\lambda_n)_{n\ge1}$ and the fact that
$\|g \bar\otimes h\|_{\mathscr{L}_1(H)} = \|g\|\,\|h\|$. Therefore, by the completeness of $\mathscr{L}_1(H)$, the partial sums
converge to some operator $\widetilde T \in \mathscr{L}_1(H)$; but since we know already that the sum converges
to $T$ in $\mathscr{L}(H)$, we must have $\widetilde T = T$.


Trace class operators have the following ideal property: 


Theorem 14.24 (Ideal property). If $T$ is a trace class operator and if $S$ and $U$ are
bounded, then $STU$ is a trace class operator and

\[ \|STU\|_{\mathscr{L}_1(H)} \le \|S\|\,\|T\|_{\mathscr{L}_1(H)}\,\|U\|. \]

For the proof we need a lemma which is of some independent interest:

Lemma 14.25. Every contraction is a convex combination of four unitaries.


Proof  Let $T \in \mathscr{L}(H)$ be a contraction. The two operators $S_1 := \frac12(T + T^*)$ and $S_2 := \frac1{2i}(T - T^*)$
are selfadjoint and satisfy $T = S_1 + iS_2$. The four operators $U_j^{\pm} := S_j \pm i(I - S_j^2)^{1/2}$
are unitary and satisfy $S_j = \frac12(U_j^+ + U_j^-)$.
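In the one-dimensional case $H = \mathbb{C}$ the construction in this proof reduces to a statement about complex numbers, which makes it easy to check by hand or by machine. The following sketch (plain Python, an illustration that is not part of the original text; the value of `t` and the variable names are arbitrary choices) treats a complex number of modulus at most one as a contraction, builds the four scalars corresponding to $U_1^\pm$ and $U_2^\pm$, and verifies that they have modulus one and that $S_1 + iS_2$ recovers the original number.

```python
import math

t = 0.3 - 0.6j            # a contraction on C (|t| <= 1), arbitrary choice
s1, s2 = t.real, t.imag   # scalar versions of S_1 and S_2

u = {}
for j, s in ((1, s1), (2, s2)):
    root = math.sqrt(1 - s * s)     # scalar version of (I - S_j^2)^(1/2)
    u[j, +1] = complex(s, root)     # U_j^+
    u[j, -1] = complex(s, -root)    # U_j^-

moduli = [abs(z) for z in u.values()]
# S_1 + i S_2 with S_j = (U_j^+ + U_j^-) / 2.
rebuilt = 0.5 * (u[1, +1] + u[1, -1]) + 0.5j * (u[2, +1] + u[2, -1])

print(moduli, rebuilt)
```

All four moduli equal one up to rounding, and `rebuilt` coincides with `t`.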


Proof of Theorem 14.24  If $U$ is a unitary operator, the operator $TU$ is a trace class
operator and $\|TU\|_{\mathscr{L}_1(H)} \le \|T\|_{\mathscr{L}_1(H)}$ by Theorem 14.20. For contractions $U$, the same
conclusion follows by combining the unitary case with Lemma 14.25 and the general
case follows by scaling. The proof can now be finished as in Proposition 14.7.


Proposition 14.26. A bounded operator $T$ is a trace class operator if and only if $T^*$
is a trace class operator, and in this case we have $\operatorname{tr}(T^*) = \overline{\operatorname{tr}(T)}$ and

\[ \|T^*\|_{\mathscr{L}_1(H)} = \|T\|_{\mathscr{L}_1(H)}. \]


Proof  This is immediate from Theorem 14.20.


Proposition 14.27. If $T$ is a trace class operator and $S$ is bounded, then $ST$ and $TS$ are
trace class operators and

\[ \operatorname{tr}(ST) = \operatorname{tr}(TS). \]

Proof  That $ST$ and $TS$ are trace class operators follows from Theorem 14.24.

To prove the identity $\operatorname{tr}(ST) = \operatorname{tr}(TS)$ we first assume that $S$ is unitary. If $(h_n)_{n\ge1}$ is
an orthonormal basis for $H$, then so is $(Sh_n)_{n\ge1}$. Hence, since the trace is independent
of the choice of basis,

\[ \operatorname{tr}(ST) = \sum_{n\ge1} (STh_n|h_n) = \sum_{n\ge1} (STSh_n|Sh_n) = \sum_{n\ge1} (TSh_n|h_n) = \operatorname{tr}(TS), \]

where we used that $S^*S = I$. The general case follows as in the previous proof by writing
a contraction $S$ as a convex combination of four unitaries.
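In finite dimensions the identity $\operatorname{tr}(ST) = \operatorname{tr}(TS)$ is the familiar fact about matrix traces. A minimal check in plain Python (added for illustration, not part of the original text; the matrices `S` and `T` are arbitrary choices) shows that the traces agree even though the two products differ.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

S = [[0, 1], [2, 3]]
T = [[1, -1], [4, 0]]

ST, TS = matmul(S, T), matmul(T, S)
print(ST, TS, trace(ST), trace(TS))
```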


We conclude with a proposition describing the relationship between trace class operators
and Hilbert–Schmidt operators. As a preliminary observation note that the inner
product of $\mathscr{L}_2(H)$ can be reinterpreted in terms of the trace: we have the trace duality

\[ (T_1|T_2) = \operatorname{tr}(T_1T_2^*) = \operatorname{tr}(T_2^*T_1). \]


Proposition 14.28. A bounded operator on $H$ is a trace class operator if and only if it
is the product of two Hilbert–Schmidt operators. If $T = S_2S_1$ is such a decomposition,
then

\[ \|T\|_{\mathscr{L}_1(H)} \le \|S_1\|_{\mathscr{L}_2(H)}\,\|S_2\|_{\mathscr{L}_2(H)}. \]


Proof  'If': If $S_1$ and $S_2$ are Hilbert–Schmidt and $T := S_2S_1$, then for all orthonormal
sequences $(g_j)_{j=1}^n$ and $(h_j)_{j=1}^n$ in $H$ and all $n \ge 1$,

\[
\begin{aligned}
\sum_{j=1}^n |(Tg_j|h_j)| &\le \sum_{j=1}^n \|S_1g_j\|\,\|S_2^*h_j\| \le \Big( \sum_{j=1}^n \|S_1g_j\|^2 \Big)^{1/2} \Big( \sum_{j=1}^n \|S_2^*h_j\|^2 \Big)^{1/2} \\
&\le \|S_1\|_{\mathscr{L}_2(H)}\,\|S_2^*\|_{\mathscr{L}_2(H)} = \|S_1\|_{\mathscr{L}_2(H)}\,\|S_2\|_{\mathscr{L}_2(H)}.
\end{aligned}
\]

Letting $n \to \infty$, by Theorem 14.20, this implies that $T$ is a trace class operator and
satisfies the inequality $\|T\|_{\mathscr{L}_1(H)} \le \|S_1\|_{\mathscr{L}_2(H)}\,\|S_2\|_{\mathscr{L}_2(H)}$.

'Only if': Using a polar decomposition $T = U|T|$ with $U$ a partial isometry, take
$S_1 = |T|^{1/2}$ and $S_2 = U|T|^{1/2}$. Since $|T|$ is a trace class operator, $|T|^{1/2}$ is a Hilbert–Schmidt
operator, and hence so are $S_1$ and $S_2$.


14.3 Trace Duality 


We have already noted that the space $\mathscr{L}_2(H)$ of Hilbert–Schmidt operators on $H$ is a
Hilbert space with respect to the inner product given by trace duality,

\[ (T_1|T_2) = \operatorname{tr}(T_1T_2^*), \quad T_1, T_2 \in \mathscr{L}_2(H). \]

The next theorem establishes that, by the same formula, $\mathscr{L}_1(H)$ can be identified isometrically
as the dual of $\mathscr{K}(H)$, the closed subspace of $\mathscr{L}(H)$ consisting of all compact
operators on $H$, and $\mathscr{L}(H)$ as the dual of $\mathscr{L}_1(H)$.

Theorem 14.29 (Trace duality). By trace duality we have isometric isomorphisms

\[ (\mathscr{K}(H))^* \simeq \mathscr{L}_1(H) \quad \text{and} \quad (\mathscr{L}_1(H))^* \simeq \mathscr{L}(H). \]

More precisely, the following results hold:


(1) for every $T \in \mathscr{L}_1(H)$ the mapping $\varphi_T : \mathscr{K}(H) \to \mathbb{C}$ given by $\varphi_T(S) := \operatorname{tr}(ST)$ is
linear and bounded and satisfies

\[ \|\varphi_T\|_{(\mathscr{K}(H))^*} = \|T\|_{\mathscr{L}_1(H)}, \]

and, conversely, for every $\varphi \in (\mathscr{K}(H))^*$ there exists a unique $T \in \mathscr{L}_1(H)$ such that
$\varphi = \varphi_T$;

(2) for every $T \in \mathscr{L}(H)$ the mapping $\psi_T : \mathscr{L}_1(H) \to \mathbb{C}$ given by $\psi_T(S) := \operatorname{tr}(ST)$ is
linear and bounded and satisfies

\[ \|\psi_T\|_{(\mathscr{L}_1(H))^*} = \|T\|_{\mathscr{L}(H)}, \]

and, conversely, for every $\psi \in (\mathscr{L}_1(H))^*$ there exists a unique $T \in \mathscr{L}(H)$ such that
$\psi = \psi_T$.


The point of working with $\operatorname{tr}(ST)$ rather than $\operatorname{tr}(ST^*)$ is that this makes the identifications
of the duals into a linear correspondence rather than a conjugate-linear one.


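Proof  (1): Linearity of $\varphi_T$ is clear and boundedness follows from Theorem 14.24,
which also gives the upper bound

\[ \|\varphi_T\|_{(\mathscr{K}(H))^*} \le \|T\|_{\mathscr{L}_1(H)}. \]

To conclude the proof of (1) it remains to show that every $\varphi \in (\mathscr{K}(H))^*$ is of the form
$\varphi_T$ for some $T \in \mathscr{L}_1(H)$ and that the converse inequality $\|T\|_{\mathscr{L}_1(H)} \le \|\varphi_T\|_{(\mathscr{K}(H))^*}$
holds; this also gives uniqueness.

Let $\varphi \in (\mathscr{K}(H))^*$ be given. The inclusion mapping $i : \mathscr{L}_2(H) \to \mathscr{K}(H)$ is continuous,
so the same is true for the functional $\widetilde\varphi = \varphi \circ i : \mathscr{L}_2(H) \to \mathbb{C}$. By the Riesz
representation theorem there exists a unique $T \in \mathscr{L}_2(H)$ such that

\[ \varphi(S) = \widetilde\varphi(S) = (S|T)_{\mathscr{L}_2(H)}, \quad S \in \mathscr{L}_2(H). \]

Let $T = U|T|$ be its polar decomposition and let $(g_n)_{n\ge1}$ be an orthonormal basis
for $\overline{R(|T|)}$. Denoting by $P_n = \sum_{j=1}^n g_j \bar\otimes g_j$ the orthogonal projection onto the span of
$g_1,\dots,g_n$,

\[ \sum_{n\ge1} (|T|g_n|g_n) = \sum_{n\ge1} (U^*Tg_n|g_n) = \sum_{n\ge1} (Ug_n|Tg_n) = \lim_{n\to\infty} (P_nUP_n|T)_{\mathscr{L}_2(H)} = \lim_{n\to\infty} \widetilde\varphi(P_nUP_n), \]

where the nonnegativity of the first expression justifies the second equality. Because $\varphi$
is continuous it follows that

\[ |\widetilde\varphi(P_nUP_n)| = |\varphi(P_nUP_n)| \le \|\varphi\|\,\|P_nUP_n\| \le \|\varphi\|, \]

which proves that $T$ is a trace class operator with $\|T\|_{\mathscr{L}_1(H)} = \operatorname{tr}(|T|) \le \|\varphi\|$.
We now show that $\varphi = \varphi_{T^*}$. For all $g,h \in H$ we have

\[ \varphi_{T^*}(g \bar\otimes h) = \operatorname{tr}((g \bar\otimes h) \circ T^*) = \operatorname{tr}(T^* \circ (g \bar\otimes h)) = \operatorname{tr}((T^*g) \bar\otimes h) = (T^*g|h) = (g|Th). \]

On the other hand, if $(h_n)_{n\ge1}$ is an orthonormal basis such that $h_1 = h$ (assuming, as we
may by scaling, that $\|h\| = 1$),

\[ \widetilde\varphi(g \bar\otimes h) = (g \bar\otimes h|T)_{\mathscr{L}_2(H)} = \sum_{n\ge1} ((g \bar\otimes h)h_n|Th_n) = (h|h)(g|Th) = (g|Th). \]

By linearity, this proves the identity $\varphi_{T^*}(S) = \varphi(S)$ for all finite rank operators $S$. Since
these are dense in $\mathscr{K}(H)$ by Proposition 7.6, it follows that $\varphi = \varphi_{T^*}$ as claimed.


(2): Again linearity is clear and boundedness follows from Theorem 14.24, which
also gives the upper bound $\|\psi_T\|_{(\mathscr{L}_1(H))^*} \le \|T\|_{\mathscr{L}(H)}$. The converse inequality follows
from

\[
\begin{aligned}
\|T\|_{\mathscr{L}(H)} &= \sup_{\|x\|,\|y\|\le1} |(Tx|y)| = \sup_{\|x\|,\|y\|\le1} |\operatorname{tr}(T \circ (x \bar\otimes y))| \\
&= \sup_{\|x\|,\|y\|\le1} |\psi_T(x \bar\otimes y)| \le \|\psi_T\| \sup_{\|x\|,\|y\|\le1} \|x \bar\otimes y\|_{\mathscr{L}_1(H)} = \|\psi_T\|,
\end{aligned}
\]

which also gives uniqueness.

To conclude the proof of (2) it remains to show that every $\psi \in (\mathscr{L}_1(H))^*$ is of the
form $\psi_T$ for some (necessarily unique) $T \in \mathscr{L}(H)$. By the Riesz representation theorem,
for any $h \in H$ there is a unique element $Th \in H$ such that

\[ (g|Th) = \psi(g \bar\otimes h), \quad g \in H, \]

and the mapping $h \mapsto Th$ is linear. From the identity it is immediate that $T$ is bounded,
with $\|T\| \le \|\psi\|$. As in the proof of (1), for all finite rank operators $S$ we have $\psi(S) = \psi_{T^*}(S)$.
By Theorem 14.23 the finite rank operators are dense in $\mathscr{L}_1(H)$. Therefore,

\[ \psi = \psi_{T^*}. \]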


14.4 The Partial Trace 


If we think of the trace as the noncommutative analogue of the expectation, the partial 
trace of a trace class operator is then the noncommutative analogue of the conditional 
expectation of a random variable. 

Using the notation of Appendix B we introduce the following definition. 
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Definition 14.30 (Hilbert space tensor product). The Hilbert space tensor product of 
the Hilbert spaces Hj,...,Hy is the completion of the algebraic tensor product H; © 
-- @ Hy with respect to the norm obtained from the inner product 


(Lae. ean |. ee Ey (gh jay? 


With slight abuse of notation the Hilbert space tensor product of H),...,Hy is de- 
noted again by H; ®---@ Hy. We leave it to the reader to check that if H),...,Hy are 
separable, with an orthonormal basis (n" )) j>1 for each A, then the tensors ni! ) @-+-@ 


i form an orthonormal basis for H; ®---@ Ay. 
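If $(\Omega_1,\mu_1),\dots,(\Omega_N,\mu_N)$ are $\sigma$-finite measure spaces, then the linear mapping from
$L^2(\Omega_1,\mu_1) \otimes \cdots \otimes L^2(\Omega_N,\mu_N)$ into $L^2(\Omega_1 \times \cdots \times \Omega_N, \mu_1 \times \cdots \times \mu_N)$ defined by

$$f_1 \otimes \cdots \otimes f_N \mapsto \Bigl[(\omega_1,\dots,\omega_N) \mapsto \prod_{n=1}^{N} f_n(\omega_n)\Bigr]$$

extends uniquely to an isometric isomorphism

$$L^2(\Omega_1,\mu_1) \otimes \cdots \otimes L^2(\Omega_N,\mu_N) \simeq L^2(\Omega_1 \times \cdots \times \Omega_N, \mu_1 \times \cdots \times \mu_N). \tag{14.3}$$
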
Now let $H$ and $K$ be Hilbert spaces. If $S \in \mathscr{L}(H)$, then the operator $S \otimes I$, defined on
the algebraic tensor product of $H$ and $K$ by

$$(S \otimes I)(h \otimes k) := Sh \otimes k$$

and extended by linearity, extends to a bounded operator on the Hilbert space tensor
product $H \otimes K$ and

$$\|S \otimes I\| = \|S\|.$$

We leave the proof of this simple fact as an exercise to the reader; a more general version
of this result will be proved in Section 15.6.c (see Proposition 15.65).

For $k \in K$ let $U_k : H \to H \otimes K$ be given by

$$U_k h := h \otimes k.$$

Its Hilbert space adjoint equals $U_k^*(h \otimes k') = (k|k')h$.

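Theorem 14.31 (Partial trace). Let $H$ and $K$ be separable Hilbert spaces and let
$T \in \mathscr{L}_1(H \otimes K)$. There exists a unique operator $\operatorname{tr}_K(T) \in \mathscr{L}_1(H)$ such that for all $S \in \mathscr{L}(H)$
we have

$$\operatorname{tr}(\operatorname{tr}_K(T)S) = \operatorname{tr}(T(S \otimes I)). \tag{14.4}$$

The mapping $T \mapsto \operatorname{tr}_K(T)$ is called the partial trace with respect to $K$ and is obtained
by tracing out $K$.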
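
Proof We claim that if $(v_n)_{n\ge 1}$ is an orthonormal basis of $K$, the sum

$$\operatorname{tr}_K(T) = \sum_{n\ge 1} U_{v_n}^* T U_{v_n} \tag{14.5}$$

converges in $\mathscr{L}_1(H)$ and its sum has the required properties.

By Theorem 14.24, each operator $U_{v_n}^* T U_{v_n}$ is a trace class operator. Hence by
Theorem 14.20, for each $n \ge 1$ there exist orthonormal sequences $(g_j^{(n)})_{j\ge 1}$ and $(h_j^{(n)})_{j\ge 1}$ of
$H$ such that

$$\|U_{v_n}^* T U_{v_n}\|_{\mathscr{L}_1(H)} = \sum_{j\ge 1} \bigl|(U_{v_n}^* T U_{v_n} g_j^{(n)} \,|\, h_j^{(n)})\bigr|.$$

It follows that

$$
\begin{aligned}
\sum_{n\ge 1} \|U_{v_n}^* T U_{v_n}\|_{\mathscr{L}_1(H)} &= \sum_{n\ge 1}\sum_{j\ge 1} \bigl|(U_{v_n}^* T U_{v_n} g_j^{(n)} \,|\, h_j^{(n)})\bigr| \\
&= \sum_{n\ge 1}\sum_{j\ge 1} \bigl|(T(g_j^{(n)} \otimes v_n) \,|\, h_j^{(n)} \otimes v_n)\bigr| < \infty,
\end{aligned}
$$

where the last step uses that $T$ is a trace class operator and the sequences $(g_j^{(n)} \otimes v_n)_{j,n\ge 1}$
and $(h_j^{(n)} \otimes v_n)_{j,n\ge 1}$ are orthonormal in $H \otimes K$.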


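Next we check the required identity. If $(u_m)_{m\ge 1}$ is an orthonormal basis for $H$, then

$$
\begin{aligned}
\operatorname{tr}(\operatorname{tr}_K(T)S) &= \sum_{n\ge 1} \operatorname{tr}(U_{v_n}^* T U_{v_n} S) = \sum_{n\ge 1}\sum_{m\ge 1} (T U_{v_n} S u_m \,|\, U_{v_n} u_m) \\
&= \sum_{n\ge 1}\sum_{m\ge 1} (T(S \otimes I)(u_m \otimes v_n) \,|\, u_m \otimes v_n) = \operatorname{tr}(T(S \otimes I)).
\end{aligned}
$$

It remains to prove uniqueness. If $A$ is a trace class operator on $H$ such that

$$\operatorname{tr}(AS) = \operatorname{tr}(\operatorname{tr}_K(T)S)$$

for all $S \in \mathscr{L}(H)$, then Theorem 14.29 implies that $A = \operatorname{tr}_K(T)$.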


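Example 14.32 (Partial trace of a rank one projection). If $h \in H$ and $k \in K$ have norm
one and $T = (h \otimes k)\,\bar\otimes\,(h \otimes k)$ is the corresponding rank one projection in $H \otimes K$, then
$\operatorname{tr}_K(T)$ is the rank one projection $h\,\bar\otimes\,h$ in $H$:

$$\operatorname{tr}_K(T) = h\,\bar\otimes\,h.$$

Indeed, for all $S \in \mathscr{L}(H)$ we have

$$
\begin{aligned}
\operatorname{tr}(\operatorname{tr}_K(T)S) &= \operatorname{tr}(T(S \otimes I)) = ((S \otimes I)(h \otimes k) \,|\, h \otimes k) \\
&= (Sh \otimes k \,|\, h \otimes k) = (Sh|h)(k|k) = (Sh|h) = \operatorname{tr}((h\,\bar\otimes\,h)S).
\end{aligned}
$$

The result now follows from the uniqueness part of Theorem 14.31.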
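
In finite dimensions, formula (14.5) says that the matrix of the partial trace is obtained by summing the diagonal blocks indexed by the basis of $K$. The following minimal sketch checks Example 14.32 numerically for $H = K = \mathbb{C}^2$. The function names and the flattening convention $(i,a) \mapsto i\,d_K + a$ for the product basis are assumptions made for this illustration; they do not come from the text.

```python
# Illustrative sketch (assumed conventions): operators on H (x) K are stored as
# nested lists in the product basis u_i (x) v_a, flattened to the index i*dK + a.

def partial_trace_K(T, dH, dK):
    """Matrix of tr_K(T): entry (i, j) is sum_a T[(i,a)][(j,a)], cf. (14.5)."""
    return [[sum(T[i * dK + a][j * dK + a] for a in range(dK))
             for j in range(dH)] for i in range(dH)]

def rank_one_projection(x):
    """Matrix of y -> (y|x) x for a unit vector x: entries x_i * conj(x_j)."""
    return [[xi * xj.conjugate() for xj in x] for xi in x]

def tensor(h, k):
    """Coordinates of h (x) k in the flattened product basis."""
    return [hi * ka for hi in h for ka in k]

def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))

# Unit vectors h in H and k in K (arbitrary choices for the check).
h = [0.6, 0.8j]
k = [2 ** -0.5, 2 ** -0.5]

T = rank_one_projection(tensor(h, k))   # rank one projection on H (x) K
reduced = partial_trace_K(T, 2, 2)      # should coincide with the projection onto h
expected = rank_one_projection(h)

max_error = max(abs(reduced[i][j] - expected[i][j])
                for i in range(2) for j in range(2))
print("max deviation from h (x)- h:", max_error)
print("trace of the partial trace:", trace(reduced))
```

The same routine can be applied to any matrix on $\mathbb{C}^{d_H} \otimes \mathbb{C}^{d_K}$, for instance to observe that the trace is preserved under tracing out $K$.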

In the terminology of the next chapter, the following proposition states that the partial
trace of a state is again a state.


Proposition 14.33. Let $H$ and $K$ be separable Hilbert spaces and let $T \in \mathscr{L}_1(H \otimes K)$.
Then:


(1) if $T$ has unit trace, then so has $\operatorname{tr}_K(T)$;
(2) if $T$ is positive, then so is $\operatorname{tr}_K(T)$.


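Proof Both properties are immediate consequences of the formulas (14.4) and (14.5)
for the partial trace. Indeed, the first implies that if $\operatorname{tr}(T) = 1$, then for any orthonormal
basis $(u_n)_{n\ge 1}$ of $H$ and any orthonormal basis $(v_k)_{k\ge 1}$ of $K$,

$$
\begin{aligned}
\operatorname{tr}(\operatorname{tr}_K(T)) &= \sum_{n\ge 1} \operatorname{tr}\bigl(\operatorname{tr}_K(T)(u_n\,\bar\otimes\,u_n)\bigr) = \sum_{n\ge 1} \operatorname{tr}\bigl(T((u_n\,\bar\otimes\,u_n) \otimes I)\bigr) \\
&= \sum_{n\ge 1}\sum_{j,k\ge 1} \bigl(T((u_n\,\bar\otimes\,u_n) \otimes I)(u_j \otimes v_k) \,|\, u_j \otimes v_k\bigr) \\
&= \sum_{n\ge 1}\sum_{j,k\ge 1} \bigl(T((u_n\,\bar\otimes\,u_n)u_j \otimes v_k) \,|\, u_j \otimes v_k\bigr) \\
&= \sum_{n\ge 1}\sum_{k\ge 1} \bigl(T(u_n \otimes v_k) \,|\, u_n \otimes v_k\bigr) = \operatorname{tr}(T) = 1.
\end{aligned}
$$

This proves (1). Assertion (2) follows from

$$(\operatorname{tr}_K(T)h|h) = \sum_{n\ge 1} (U_{v_n}^* T U_{v_n} h|h) = \sum_{n\ge 1} (T U_{v_n} h \,|\, U_{v_n} h) \ge 0.$$
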


14.5 Trace Formulas 


In this final section we illustrate the preceding theory by computing traces in a number 
of interesting situations. 


14.5.a Lidskii’s Theorem 


If $T$ is a linear operator acting on $\mathbb{C}^d$, we may select an orthonormal basis with respect
to which its matrix representation is in Jordan normal form. Using this basis we find
that

$$\operatorname{tr}(T) = \sum_{n=1}^{d} \lambda_n,$$

where $\lambda_1,\dots,\lambda_d$ are the eigenvalues of $T$ repeated according to their algebraic
multiplicities; see Example 7.16.

If T is a normal trace class operator on a Hilbert space H, the spectral theorem for 
compact normal operators allows us to select an orthonormal basis for H consisting 
of eigenvectors in the following way. For each of the eigenspaces corresponding to 
the eigenvalues of T we select an orthonormal basis. These eigenspaces are mutually 
orthogonal and the union of these bases, after a relabelling, is an orthonormal basis 
$(h_n)_{n\ge 1}$ for $H$. For this basis we have

$$\operatorname{tr}(T) = \sum_{n\ge 1} (Th_n|h_n) = \sum_{n\ge 1} \lambda_n,$$

where $(\lambda_n)_{n\ge 1}$ is the sequence of nonzero eigenvalues of $T$ repeated according to their
multiplicities; in the last sum we left out the indices corresponding to eigenvalue $\lambda_n = 0$
and did another relabelling.

The following deep result asserts that these formulas for the trace extend to general 

trace class operators: 


Theorem 14.34 (Lidskii). For every trace class operator $T$ we have

$$\operatorname{tr}(T) = \sum_{n\ge 1} \lambda_n,$$

where $(\lambda_n)_{n\ge 1}$ is the sequence of nonzero eigenvalues of $T$ repeated according to their
algebraic multiplicities.


Here we use the convention $\operatorname{tr}(T) = 0$ in case there are no nonzero eigenvalues. We
present the beautiful proof of this theorem due to Simon, which is based on the theory of
Fredholm determinants. In order to introduce these, we need some notation from
multilinear algebra. We refer to Appendix B for the definitions. The $n$-fold exterior product
of a vector space $V$ is denoted by $\Lambda^n V$. If $T$ is a linear operator on $V$, then

$$\Lambda^n(T)(v_1 \wedge \cdots \wedge v_n) := Tv_1 \wedge \cdots \wedge Tv_n$$

defines a linear operator $\Lambda^n(T)$ on $\Lambda^n(V)$. It is the restriction to $\Lambda^n(V)$ of the $n$-fold
tensor product $T^{\otimes n}$ acting on $V^{\otimes n}$. If $S$ is another linear operator on $V$, then
$\Lambda^n(ST) = \Lambda^n(S)\Lambda^n(T)$.

If $H$ is a Hilbert space and $T$ is bounded on $H$, then $\Lambda^n(T)$ is bounded on $\Lambda^n(H)$,
which is a Hilbert space in a natural way, and its adjoint equals $(\Lambda^n(T))^* = \Lambda^n(T^*)$.
From this we infer that $|\Lambda^n(T)| = \Lambda^n(|T|)$. Thus if $(\mu_j)_{j\ge 1}$ is the singular value
sequence of $T$, the singular values of $\Lambda^n(T)$ are $\mu_{j_1} \cdots \mu_{j_n}$ with $j_1 < \cdots < j_n$. It follows
that if $T$ is a trace class operator, then so is $\Lambda^n(T)$, and

$$\|\Lambda^n(T)\|_{\mathscr{L}_1(\Lambda^n(H))} = \sum_{j_1 < \cdots < j_n} \mu_{j_1} \cdots \mu_{j_n} \le \frac{1}{n!} \sum_{j_1,\dots,j_n \ge 1} \mu_{j_1} \cdots \mu_{j_n} = \frac{1}{n!}\|T\|_{\mathscr{L}_1(H)}^n. \tag{14.6}$$


For $(n \times n)$ matrices $A$ we have the following identity relating the determinant to
traces and exterior products, known as MacMahon's formula:

$$\det(I + A) = \sum_{k=0}^{n} \operatorname{tr}(\Lambda^k(A)),$$

with the convention that $\Lambda^0(A) = I$. A proof is sketched in Problem 14.13. Observing
that $\Lambda^k(V) = \{0\}$ when $k > \dim(V)$, MacMahon's formula suggests the following
definition.


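Definition 14.35 (Fredholm determinant). Let $T \in \mathscr{L}(H)$ be a trace class operator. The
Fredholm determinant of $I + T$ is defined as

$$\det(I + T) := \sum_{n\in\mathbb{N}} \operatorname{tr}(\Lambda^n(T)).$$

The sum on the right-hand side is absolutely convergent since

$$\sum_{n\in\mathbb{N}} |\operatorname{tr}(\Lambda^n(T))| \le \sum_{n\in\mathbb{N}} \|\Lambda^n(T)\|_{\mathscr{L}_1(\Lambda^n(H))} \le \sum_{n\in\mathbb{N}} \frac{1}{n!}\|T\|_{\mathscr{L}_1(H)}^n = \exp(\|T\|_{\mathscr{L}_1(H)}). \tag{14.7}$$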
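
MacMahon's formula, on which Definition 14.35 is modelled, can be checked numerically for a small matrix. The sketch below uses the standard fact that $\operatorname{tr}(\Lambda^k(A))$ equals the sum of the principal $(k \times k)$ minors of $A$; the helper names and the sample matrix are chosen for illustration only.

```python
from itertools import combinations

def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(M)
    if n == 0:
        return 1  # determinant of the empty matrix, matching Lambda^0(A) = I
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def trace_exterior_power(A, k):
    """tr(Lambda^k(A)), computed as the sum of all principal k x k minors of A."""
    n = len(A)
    return sum(det([[A[i][j] for j in idx] for i in idx])
               for idx in combinations(range(n), k))

# Sample integer matrix (an arbitrary choice), so that all arithmetic is exact.
A = [[2, -1, 0],
     [3, 1, 4],
     [0, 5, -2]]
n = len(A)

identity_plus_A = [[A[i][j] + (1 if i == j else 0) for j in range(n)]
                   for i in range(n)]
lhs = det(identity_plus_A)
rhs = sum(trace_exterior_power(A, k) for k in range(n + 1))
print(lhs, rhs)  # both sides of MacMahon's formula
```

For this matrix the terms $k = 0, 1, 2, 3$ contribute $1$, $1$, $-21$ and $-50$, respectively, and both sides equal $-69$.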


The crucial step in the proof of Lidskii's theorem is to establish the following identity,
valid for all trace class operators $T$ and all $\mu \in \mathbb{C}$:

$$\det(I + \mu T) = \prod_{n\ge 1} (1 + \mu\lambda_n).$$

Here $(\lambda_n)_{n\ge 1}$ is the sequence of eigenvalues of $T$ repeated according to algebraic
multiplicities. Notice that Proposition 14.21 guarantees the convergence of the infinite
product. Once this formula has been obtained, Lidskii's theorem is immediate by comparing
the linear term of this product with the linear term in the definition
$\det(I + \mu T) = \sum_{n\in\mathbb{N}} \mu^n \operatorname{tr}(\Lambda^n(T))$.

The remainder of this section is devoted to proving Lidskii’s theorem. We fix a sepa- 
rable Hilbert space H and start with some preliminary results. 


Lemma 14.36. Let $T$ be a bounded operator on $H$ such that $T = PTP$ for some
orthogonal projection $P$ on $H$ of finite rank $\nu$. Viewing $PTP$ as an operator on the
finite-dimensional Hilbert space $\mathsf{R}(P)$, we have

$$\det(I + T) = \det(I_{\mathsf{R}(P)} + PTP).$$

Proof The identity $T = PTP$ implies that $T$ is of rank at most $\nu$, and therefore we
have $\Lambda^n(T) = 0$ for $n > \nu$. For $0 \le n \le \nu$ we have $\operatorname{tr}(\Lambda^n(T)) = \operatorname{tr}(\Lambda^n(PTP))$. Applying
Definition 14.35 twice,

$$\det(I + T) = \sum_{n=0}^{\nu} \operatorname{tr}(\Lambda^n(T)) = \sum_{n=0}^{\nu} \operatorname{tr}(\Lambda^n(PTP)) = \sum_{n=0}^{\nu} \operatorname{tr}\bigl(\Lambda^n(PTP|_{\mathsf{R}(P)})\bigr) = \det(I_{\mathsf{R}(P)} + PTP).$$


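Lemma 14.37. Let $T \in \mathscr{L}(H)$ be a trace class operator and let $(\mu_n)_{n\ge 1}$ be its singular
value sequence, repeated according to multiplicities. Then

$$|\det(I + T)| \le \prod_{n\ge 1} (1 + \mu_n).$$

Proof It follows from (14.6) that

$$|\det(I + T)| \le \sum_{n\in\mathbb{N}} |\operatorname{tr}(\Lambda^n(T))| \le \sum_{n\in\mathbb{N}} \|\Lambda^n(T)\|_{\mathscr{L}_1(\Lambda^n(H))} = \sum_{n\in\mathbb{N}} \sum_{j_1 < \cdots < j_n} \mu_{j_1} \cdots \mu_{j_n} = \prod_{n\ge 1} (1 + \mu_n).$$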


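Lemma 14.38. Let $T \in \mathscr{L}(H)$ be a trace class operator. For all $\varepsilon > 0$ there exists a
constant $C_\varepsilon \ge 0$ such that for all $\lambda \in \mathbb{C}$ we have

$$|\det(I + \lambda T)| \le C_\varepsilon \exp(\varepsilon|\lambda|).$$

Proof Using the inequality $|1 + t| \le \exp(|t|)$, Lemma 14.37 implies, for any $N \ge 1$,

$$|\det(I + \lambda T)| \le \prod_{n\ge 1} (1 + |\lambda|\mu_n) \le \prod_{n=1}^{N} (1 + |\lambda|\mu_n) \exp\Bigl(\sum_{n\ge N+1} |\lambda|\mu_n\Bigr).$$

Fix $\varepsilon > 0$. If we choose $N \ge 1$ so large that $\sum_{n\ge N+1} \mu_n < \frac12\varepsilon$, the desired estimate is
obtained, with

$$C_\varepsilon = \sup_{\lambda\in\mathbb{C}} \prod_{n=1}^{N} (1 + |\lambda|\mu_n) \exp(-\tfrac12\varepsilon|\lambda|).$$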


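Lemma 14.39. The map $T \mapsto \det(I + T)$ is continuous from $\mathscr{L}_1(H)$ to $\mathbb{C}$.

Proof Suppose $T_j \to T$ in $\mathscr{L}_1(H)$ as $j \to \infty$. Fix $\varepsilon > 0$ and choose $N \ge 0$ so large that
$\sum_{n\ge N+1} C^n/n! < \frac14\varepsilon$, where $C := \sup_j \|T_j\|_{\mathscr{L}_1(H)}$. Then, by (14.7),

$$|\det(I + T_j) - \det(I + T)| \le \tfrac12\varepsilon + \sum_{n=1}^{N} \operatorname{tr}\bigl(|\Lambda^n(T_j) - \Lambda^n(T)|\bigr).$$

Denoting by $P_n$ the orthogonal projection in $H^{\otimes n}$ onto $\Lambda^n(H)$, we have

$$\operatorname{tr}\bigl(|\Lambda^n(T_j) - \Lambda^n(T)|\bigr) = \operatorname{tr}\bigl(|P_n(T_j^{\otimes n} - T^{\otimes n})P_n|\bigr) \le \operatorname{tr}\bigl(|T_j^{\otimes n} - T^{\otimes n}|\bigr) \le nC^{n-1}\|T_j - T\|_{\mathscr{L}_1(H)}.$$

If we choose $N' \ge 1$ so large that also $\|T_j - T\|_{\mathscr{L}_1(H)} < \frac12\varepsilon\bigl(\sum_{n=1}^{N} nC^{n-1}\bigr)^{-1}$ for $j \ge N'$,
then $|\det(I + T_j) - \det(I + T)| < \varepsilon$ for $j \ge N'$.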


Lemma 14.40. If $S, T \in \mathscr{L}(H)$ are trace class operators, then

$$\det(I + T)\det(I + S) = \det((I + T)(I + S)).$$

Proof First assume that $T$ and $S$ are both of finite rank. Let $P$ be a finite rank projection
in $H$ whose range contains the ranges of $T$, $T^*$, $S$, and $S^*$. With $m$ being the rank of $P$,
from Lemma 14.36 we obtain

$$
\begin{aligned}
\det(I + T)\det(I + S) &= \operatorname{tr}(\Lambda^m(P(I + T)P))\operatorname{tr}(\Lambda^m(P(I + S)P)) \\
&= \operatorname{tr}(\Lambda^m(P(I + T)P)\Lambda^m(P(I + S)P)) \\
&= \operatorname{tr}(\Lambda^m(P(I + T)(I + S)P)) = \det((I + T)(I + S)).
\end{aligned}
$$

Here we used that $\Lambda^m(\mathsf{R}(P))$ is one-dimensional, so that the trace is multiplicative on
this space. This proves the lemma for finite rank operators $T$ and $S$. By Lemma 14.39,
the general case now follows by approximation.


Proposition 14.41. If $T \in \mathscr{L}(H)$ is a trace class operator, then $I + T$ is invertible if
and only if $\det(I + T) \ne 0$.

Proof Suppose first that $I + T$ is invertible and let $S := -T(I + T)^{-1}$. Then $S$ is a trace
class operator and an easy computation gives $(I + T)(I + S) = I$. It follows from Lemma
14.40 that $\det(I + T)\det(I + S) = \det(I) = 1$, so $\det(I + T) \ne 0$.

If $I + T$ is not invertible, then $-1$ is an eigenvalue of $T$. Denoting the corresponding
spectral projection by $P$, then from Lemma 14.40 and the commutation relation $TP = PT$
we obtain

$$\det(I + TP)\det(I + T(I - P)) = \det(I + TP + T(I - P) + TPT(I - P)) = \det(I + T).$$

Denote by $\nu$ the algebraic multiplicity of $-1$. By Lemma 14.36 applied to $TP$,
$\det(I + TP)$ is the determinant of a finite-dimensional noninvertible operator and therefore
it equals 0. This proves that $\det(I + T) = \det(I + TP)\det(I + T(I - P)) = 0$.




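Proposition 14.42. If $T \in \mathscr{L}(H)$ is a trace class operator with nonzero eigenvalue
$-1/\mu_0$ of algebraic multiplicity $\nu$, then $F(\mu) := \det(I + \mu T)$ has a zero at $\mu_0$ of
multiplicity $\nu$.

Proof Denoting by $P$ the spectral projection associated with $-1/\mu_0$, we have

$$\det(I + \mu T) = \det(I + \mu TP)\det(I + \mu T(I - P))$$

and $\det(I + \mu_0 T(I - P)) \ne 0$ by Proposition 14.41. The operator $TP$ vanishes on the
range of $I - P$ and its restriction to the range of $P$ has spectrum $\{-1/\mu_0\}$. Thus, for
$0 \le n \le \nu$,

$$\operatorname{tr}(\Lambda^n(TP)) = \sum_{1 \le j_1 < \cdots < j_n \le \nu} \Bigl(-\frac{1}{\mu_0}\Bigr)^n = \binom{\nu}{n}\Bigl(-\frac{1}{\mu_0}\Bigr)^n$$

and consequently

$$\det(I + \mu TP) = \sum_{n=0}^{\nu} \binom{\nu}{n}\Bigl(-\frac{\mu}{\mu_0}\Bigr)^n = \Bigl(1 - \frac{\mu}{\mu_0}\Bigr)^{\nu}.$$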


The next lemma from complex function theory is stated without proof. 


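Lemma 14.43. Let $F$ be an entire function whose zeroes $z_1, z_2, \dots$ (counting multiplicities)
satisfy $\sum_{n\ge 1} 1/|z_n| < \infty$. Assume furthermore that $F(0) = 1$ and that for all $\varepsilon > 0$
there exists a constant $C_\varepsilon \ge 0$ such that $|F(z)| \le C_\varepsilon \exp(\varepsilon|z|)$. Then

$$F(z) = \prod_{n\ge 1} \Bigl(1 - \frac{z}{z_n}\Bigr), \qquad z \in \mathbb{C}.$$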


Theorem 14.44. If $T \in \mathscr{L}(H)$ is a trace class operator, with eigenvalue sequence
$(\lambda_n)_{n\ge 1}$ repeated according to algebraic multiplicities, then for all $\mu \in \mathbb{C}$ we have

$$\det(I + \mu T) = \prod_{n\ge 1} (1 + \mu\lambda_n).$$

Proof By Propositions 14.41 and 14.42, the zeroes of $F(\mu) := \det(I + \mu T)$, counting
multiplicities, are precisely the points $-1/\lambda_n$. Proposition 14.21 and Lemma 14.38
show that the assumptions of Lemma 14.43 hold for this function. The result now follows
from the lemma.


Proof of Theorem 14.34 The linear term in the Taylor expansion of
$\det(I + \mu T) = \sum_{n\in\mathbb{N}} \mu^n \operatorname{tr}(\Lambda^n(T))$ has coefficient $\operatorname{tr}(\Lambda^1(T)) = \operatorname{tr}(T)$. On the other hand, by
Theorem 14.44, this coefficient equals $\sum_{n\ge 1} \lambda_n$.
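
For orientation, the product formula for the Fredholm determinant can be verified by hand for a rank one operator. The following worked computation is an illustration added here, with $g, h \in H$ arbitrary and the rank one operator taken to be $x \mapsto (x|h)g$; it is not part of the proof above.

```latex
% Illustration (assumed example): T x := (x|h) g with g, h in H.
\begin{aligned}
&\Lambda^n(T) = 0 \ \text{ for } n \ge 2, \ \text{ since } Tv_1 \wedge Tv_2 \text{ is a multiple of } g \wedge g = 0,\\
&\det(I + \mu T) = \operatorname{tr}(\Lambda^0(T)) + \mu \operatorname{tr}(\Lambda^1(T)) = 1 + \mu \operatorname{tr}(T) = 1 + \mu\,(g|h),\\
&Tg = (g|h)\,g, \ \text{ so the only possible nonzero eigenvalue of } T \text{ is } \lambda_1 = (g|h),\\
&\prod_{n \ge 1} (1 + \mu\lambda_n) = 1 + \mu\,(g|h) = \det(I + \mu T).
\end{aligned}
```

If $(g|h) = 0$, then $T^2 = 0$, there are no nonzero eigenvalues, and both sides reduce to 1, in agreement with the convention $\operatorname{tr}(T) = 0$ adopted after Theorem 14.34.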

14.5.b Trace Formula for Integral Operators

The trace of an integral operator with continuous kernel can be computed as follows.


Theorem 14.45 (Mercer). Let $\mu$ be a finite Borel measure on a compact metric space
$K$. Let $T$ be an integral operator on $L^2(K,\mu)$ of the form

$$Tf(x) = \int_K k(x,y)f(y)\,d\mu(y)$$

with continuous kernel $k \in C(K \times K)$. Then:

(1) if $T$ is a trace class operator, then its trace is given by

$$\operatorname{tr}(T) = \int_K k(x,x)\,d\mu(x);$$

(2) if $T$ is positive, that is, if $(Tf|f) \ge 0$ for all $f \in L^2(K,\mu)$, then $T$ is a trace class
operator.


By an argument similar to that employed in the proof below, one sees that $T$ is positive
if and only if the kernel $k$ is positive definite in the sense that for all finite choices of
$t_1,\dots,t_N \in S$ and $z_1,\dots,z_N \in \mathbb{C}$ we have

$$\sum_{n,m=1}^{N} k(t_n,t_m)z_m\overline{z_n} \ge 0.$$


Proof It has been observed in Remark 2.31 that $L^2(K,\mu)$ is separable.


(1): Suppose that the integral operator $T$ is a trace class operator. By Proposition
14.28 we have $T = S_2S_1$ with $S_1, S_2$ Hilbert–Schmidt on $L^2(K,\mu)$. Accordingly, by
Theorem 14.8 there exist $k_1, k_2 \in L^2(K \times K, \mu \times \mu)$ such that for $\mu$-almost all $s \in K$ we
have

$$Tf(s) = \int_K\int_K k_2(s,t)k_1(t,u)f(u)\,d\mu(u)\,d\mu(t).$$

As a result, for $\mu \times \mu$-almost all $(s,u) \in K \times K$ we have

$$k(s,u) = \int_K k_2(s,t)k_1(t,u)\,d\mu(t).$$

By Theorem 14.29,

$$
\begin{aligned}
\operatorname{tr}(T) = \operatorname{tr}(S_2S_1) &= (S_1|S_2^*)_{\mathscr{L}_2(L^2(K,\mu))} \overset{(*)}{=} (k_1|k_2^*)_{L^2(K\times K,\mu\times\mu)} \\
&= \int_K\int_K k_1(s,t)k_2(t,s)\,d\mu(t)\,d\mu(s) = \int_K k(s,s)\,d\mu(s),
\end{aligned}
$$

where $k_2^*(s,t) := \overline{k_2(t,s)}$ is the kernel of $S_2^*$ and $(*)$ follows from the fact, which follows from
Example 14.2 and Theorem 14.8, that the correspondence between Hilbert–Schmidt operators
and their square integrable kernels is unitary.


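(2): By the result of Example 7.7, $T$ is compact, and the positivity of $T$ implies that
its singular value sequence equals its sequence of nonzero eigenvalues $(\lambda_n)_{n\ge 1}$, taking
into account multiplicities. The rest of the proof is accomplished in two steps.


Step 1: Let $(h_n)_{n\ge 1}$ be an orthonormal basis of eigenvectors in $L^2(K,\mu)$ corresponding
to the sequence $(\lambda_n)_{n\ge 1}$. The uniform continuity of $k$ implies that $T$ maps $L^2(K,\mu)$
into $C(K)$ and therefore $Th_n = \lambda_n h_n$ implies $h_n \in C(K)$ for all $n \ge 1$. As a consequence,
for each $n \ge 1$ the kernel

$$k_n(s,t) := k(s,t) - \sum_{j=1}^{n} \lambda_j h_j(s)\overline{h_j(t)}, \qquad s,t \in K,$$

is continuous.
Let $f \in L^2(K,\mu)$. Then

$$(Tf|f) = \sum_{n\ge 1}\sum_{m\ge 1} \bigl(\lambda_n(f|h_n)h_n \,\big|\, (f|h_m)h_m\bigr) = \sum_{n\ge 1} \lambda_n|(f|h_n)|^2$$

and therefore

$$
\begin{aligned}
\int_K\int_K k_n(s,t)f(t)\overline{f(s)}\,d\mu(t)\,d\mu(s)
&= (Tf|f) - \sum_{j=1}^{n} \lambda_j \int_K\int_K h_j(s)\overline{h_j(t)}f(t)\overline{f(s)}\,d\mu(t)\,d\mu(s) \\
&= \sum_{j\ge 1} \lambda_j|(f|h_j)|^2 - \sum_{j=1}^{n} \lambda_j|(f|h_j)|^2 \ge 0.
\end{aligned}
$$

In particular, for any Borel set $B$ of positive $\mu$-measure,

$$\frac{1}{\mu(B)^2} \int_K\int_K k_n(s,t)\mathbf{1}_B(t)\mathbf{1}_B(s)\,d\mu(t)\,d\mu(s) \ge 0. \tag{14.8}$$

By a limiting argument (applying (14.8) to a sequence of balls $B(t;r_m)$ centred at a
given point $t \in \operatorname{supp}(\mu)$ with radii $r_m \downarrow 0$), from this inequality and the continuity of $k_n$,
we obtain $k_n(t,t) \ge 0$ for $\mu$-almost all $t \in K$ and all $n \ge 1$. Then,

$$0 \le \int_K k_n(t,t)\,d\mu(t) = \int_K k(t,t)\,d\mu(t) - \sum_{j=1}^{n} \lambda_j \int_K |h_j(t)|^2\,d\mu(t) = \int_K k(t,t)\,d\mu(t) - \sum_{j=1}^{n} \lambda_j.$$

Letting $n \to \infty$ we obtain that $T \in \mathscr{L}_1(L^2(K,\mu))$ and

$$\|T\|_{\mathscr{L}_1(L^2(K,\mu))} = \operatorname{tr}(T) = \sum_{n\ge 1} \lambda_n \le \int_K k(t,t)\,d\mu(t).$$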
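
In the positive case, the trace formula can be alternatively proved by the following
more elementary argument. For $m = 1,2,\dots$ let $(K_n^{(m)})_{n=1}^{N_m}$ be a partition of $K$ of mesh
less than $1/m$. For $1 \le n \le N_m$ let $h_n^{(m)} := \mathbf{1}_{K_n^{(m)}}/\sqrt{\mu(K_n^{(m)})}$ (here, and in what follows,
we discard those indices for which $\mu(K_n^{(m)}) = 0$ without expressing this in our notation
in order not to overburden it). This sequence is orthonormal in $L^2(K,\mu)$. Using the
uniform continuity of $k$ we obtain

$$
\begin{aligned}
\lim_{m\to\infty} \sum_{n=1}^{N_m} (Th_n^{(m)}|h_n^{(m)})
&= \lim_{m\to\infty} \sum_{n=1}^{N_m} \frac{1}{\mu(K_n^{(m)})} \int_{K_n^{(m)}}\int_{K_n^{(m)}} k(x,y)\,d\mu(x)\,d\mu(y) \\
&= \lim_{m\to\infty} \sum_{n=1}^{N_m} \frac{1}{\mu(K_n^{(m)})} \int_{K_n^{(m)}}\int_{K_n^{(m)}} k(y,y)\,d\mu(x)\,d\mu(y) \\
&= \lim_{m\to\infty} \sum_{n=1}^{N_m} \int_{K_n^{(m)}} k(y,y)\,d\mu(y) = \int_K k(y,y)\,d\mu(y).
\end{aligned}
$$

Hence, by Theorem 14.20,

$$\int_K k(y,y)\,d\mu(y) = \lim_{m\to\infty} \sum_{n=1}^{N_m} (Th_n^{(m)}|h_n^{(m)}) \le \operatorname{tr}(T).$$
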


14.5.c Trace Formula for Fredholm Operators 


The following theorem gives a formula for the index of a Fredholm operator in terms of 
traces. 


Theorem 14.46 (Fedosov). Let $T \in \mathscr{L}(H)$ be a Fredholm operator and let $S \in \mathscr{L}(H)$
be an operator such that both $I - ST$ and $I - TS$ are finite rank operators. Then the
commutator $[T,S] = TS - ST$ is a trace class operator and

$$\operatorname{tr}([T,S]) = \operatorname{ind}(T).$$

By Atkinson's theorem (Theorem 7.23), operators $S$ with the stated properties always
exist.


Proof The operator $[T,S] = (I - ST) - (I - TS)$ is of finite rank and hence a trace
class operator. If $S' \in \mathscr{L}(H)$ is another operator such that $I - S'T$ and $I - TS'$ are of
finite rank, then $R := S' - S$ is of finite rank and

$$\operatorname{tr}(TS' - S'T) = \operatorname{tr}(TS - ST + TR - RT) = \operatorname{tr}(TS - ST) + \operatorname{tr}(TR - RT) = \operatorname{tr}(TS - ST),$$

using that $R$ is a trace class operator and therefore $\operatorname{tr}(TR) = \operatorname{tr}(RT)$ by Proposition 14.27.
To prove the theorem it therefore suffices to prove it for the bounded operator
$S \in \mathscr{L}(H)$ constructed in the proof of Theorem 7.23. This operator enjoys the following
properties: (i) $I - ST$ and $I - TS$ are finite rank projections, and (ii) $\dim \mathsf{N}(T) =
\dim \mathsf{R}(I - ST)$ and $\operatorname{codim} \mathsf{R}(T) = \dim \mathsf{R}(I - TS)$. Since the rank of a finite rank
projection is equal to its trace (by Example 14.18), we have

$$\operatorname{ind}(T) = \dim \mathsf{N}(T) - \operatorname{codim} \mathsf{R}(T) = \operatorname{tr}(I - ST) - \operatorname{tr}(I - TS) = \operatorname{tr}(TS - ST).$$
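
As a quick sanity check of Theorem 14.46, one may take for $T$ the right shift on $\ell^2(\mathbb{N})$ and for $S$ the left shift. This example and the symbols $R$, $L$, $P_0$ are introduced here for illustration only and are not taken from the text above.

```latex
% Illustration (assumed example): H = \ell^2(\mathbb{N}), T = R (right shift), S = L = R^* (left shift).
\begin{aligned}
& R(c_0,c_1,\dots) = (0,c_0,c_1,\dots), \qquad L(c_0,c_1,\dots) = (c_1,c_2,\dots),\\
& I - LR = 0, \qquad I - RL = P_0 \quad (\text{the orthogonal projection onto } \operatorname{span}\{e_0\}),\\
& \operatorname{ind}(R) = \dim \mathsf{N}(R) - \operatorname{codim} \mathsf{R}(R) = 0 - 1 = -1,\\
& \operatorname{tr}([R,L]) = \operatorname{tr}(RL - LR) = \operatorname{tr}(-P_0) = -1.
\end{aligned}
```

Both $I - LR$ and $I - RL$ are finite rank projections here, so this pair is of the special form used in the last step of the proof.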


14.5.d Trace Formula for Commutators of Toeplitz Operators 


From Section 7.3.d we recall that $H^2(\mathbb{D})$ is the vector space of all holomorphic functions
on $\mathbb{D}$ of the form $\sum_{n\in\mathbb{N}} c_n z^n$ with $\sum_{n\in\mathbb{N}} |c_n|^2 < \infty$. Identifying it with the closed subspace
of $L^2(\mathbb{T})$ consisting of all functions whose negative Fourier coefficients vanish, $H^2(\mathbb{D})$
is the range of the Riesz projection

$$P : \sum_{n\in\mathbb{Z}} \hat f(n)e_n \mapsto \sum_{n\in\mathbb{N}} \hat f(n)e_n$$

in $L^2(\mathbb{T})$, where $e_n(\theta) = e^{in\theta}$. This projection discards the terms in the Fourier series
$\sum_{n\in\mathbb{Z}} \hat f(n)e_n$ of $f$ corresponding to the negative indices $n = -1,-2,\dots$

Given a function $\phi \in L^\infty(\mathbb{T})$, the Toeplitz operator with symbol $\phi$ has been defined
as the bounded operator $T_\phi$ on $H^2(\mathbb{D})$ given by

$$T_\phi f = P(\phi f), \qquad f \in H^2(\mathbb{D}).$$

It follows from Lemma 7.30 that for all $\phi, \psi \in C(\mathbb{T})$ the commutator

$$[T_\phi,T_\psi] = T_\phi T_\psi - T_\psi T_\phi$$

is compact. For functions $\phi, \psi \in C^2(\mathbb{T})$ we have the following stronger result.

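Theorem 14.47 (Helton–Howe). For all $\phi, \psi \in C^2(\mathbb{T})$ the commutator $[T_\phi,T_\psi]$ is a
trace class operator and

$$\operatorname{tr}([T_\phi,T_\psi]) = \frac{1}{2\pi i} \int_{-\pi}^{\pi} \phi(\theta)\psi'(\theta)\,d\theta. \tag{14.9}$$

Proof The proof is a matter of computation. First, $[T_{e_n},T_{e_m}]$ is a finite rank operator of
rank at most $\min\{|m|,|n|\}$, and therefore by (14.2) it is a trace class operator with

$$\|[T_{e_n},T_{e_m}]\|_{\mathscr{L}_1(H^2(\mathbb{D}))} \le \min\{|m|,|n|\}. \tag{14.10}$$

Second, for all $n,m \in \mathbb{Z}$ and $j \in \mathbb{N}$ we have $T_{e_n}T_{e_m}e_j = \lambda_j^{n,m}e_{n+m+j}$ with $\lambda_j^{n,m} \in \{0,1\}$,
so that

$$\operatorname{tr}([T_{e_n},T_{e_m}]) = \sum_{j\in\mathbb{N}} (\lambda_j^{n,m} - \lambda_j^{m,n})(e_{n+m+j}|e_j). \tag{14.11}$$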
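
Case 1: $n + m \ne 0$. In that case (14.11) gives

$$\operatorname{tr}([T_{e_n},T_{e_m}]) = 0 = \frac{1}{2\pi i} \int_{-\pi}^{\pi} e_n(\theta)e_m'(\theta)\,d\theta.$$

Case 2: $n + m = 0$. In this case (say with $n \ge 0$; the case $n < 0$ is handled in the same way)
$\lambda_j^{n,-n} = 0$ if $j < n$ and $\lambda_j^{n,-n} = 1$ if $j \ge n$, while
always $\lambda_j^{-n,n} = 1$, and (14.11) gives

$$\operatorname{tr}([T_{e_n},T_{e_{-n}}]) = -n = -\frac{n}{2\pi} \int_{-\pi}^{\pi} e_n(\theta)e_{-n}(\theta)\,d\theta = \frac{1}{2\pi i} \int_{-\pi}^{\pi} e_n(\theta)e_m'(\theta)\,d\theta.$$

This completes the proof of (14.9) for $\phi = e_n$ and $\psi = e_m$. Since both the left-
and right-hand side of (14.9) are linear in both $\phi$ and $\psi$, for $\phi = \sum_{n\in\mathbb{Z}} a_n e_n$ and
$\psi = \sum_{m\in\mathbb{Z}} b_m e_m$ we have

$$[T_\phi,T_\psi] = \sum_{m,n\in\mathbb{Z}} a_n b_m [T_{e_n},T_{e_m}] \tag{14.12}$$

and hence, taking traces,

$$
\begin{aligned}
\operatorname{tr}([T_\phi,T_\psi]) &= \sum_{m,n\in\mathbb{Z}} a_n b_m \operatorname{tr}([T_{e_n},T_{e_m}]) \\
&= \sum_{m,n\in\mathbb{Z}} a_n b_m \frac{1}{2\pi i} \int_{-\pi}^{\pi} e_n(\theta)e_m'(\theta)\,d\theta = \frac{1}{2\pi i} \int_{-\pi}^{\pi} \phi(\theta)\psi'(\theta)\,d\theta,
\end{aligned}
$$

provided the sum in (14.12) converges in $\mathscr{L}_1(H^2(\mathbb{D}))$. Keeping in mind (14.10), this
can be guaranteed if we assume that $\phi$ and $\psi$ are $C^2$, for then $|a_n|$ and $|b_n|$ are of order
$O(1/n^2)$ as $|n| \to \infty$ and

$$
\begin{aligned}
\sum_{m,n\in\mathbb{Z}} \frac{\min\{|m|,|n|\}}{(1+m^2)(1+n^2)}
&\le \sum_{m\in\mathbb{Z}} \sum_{\substack{n\in\mathbb{Z}\\ |n|\le|m|}} \frac{|n|}{(1+m^2)(1+n^2)} + \sum_{n\in\mathbb{Z}} \sum_{\substack{m\in\mathbb{Z}\\ |m|\le|n|}} \frac{|m|}{(1+m^2)(1+n^2)} \\
&\lesssim \sum_{k\in\mathbb{Z}} \frac{\log(1+|k|)}{1+k^2} < \infty.
\end{aligned}
$$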

14.5.e Trace Formula for the Dirichlet Heat Semigroup 


Let $D$ be a bounded open subset of $\mathbb{R}^d$ satisfying $|\partial D| = 0$ and let $S_{\mathrm{Dir}}$ be the
$C_0$-semigroup on $L^2(D)$ generated by the Dirichlet Laplacian $\Delta_{\mathrm{Dir}}$ associated with $D$.


Theorem 14.48 (Trace formula for the Dirichlet heat semigroup). For all $t > 0$ the
operator $S_{\mathrm{Dir}}(t)$ is a trace class operator on $L^2(D)$ with

$$\lim_{t\downarrow 0} t^{d/2} \operatorname{tr}(S_{\mathrm{Dir}}(t)) = \frac{|D|}{(4\pi)^{d/2}}.$$
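
For the proof of this formula we need the following lemma.


Lemma 14.49. Let $\mu$ be a Borel measure on $(0,\infty)$ whose Laplace transform satisfies

$$\mathscr{L}\mu(t) := \int_0^\infty e^{-tx}\,d\mu(x) < \infty$$

for all $t > 0$. If for some $r > 0$ and $a \in \mathbb{R}$ we have

$$\lim_{x\to\infty} x^{-r}\mu([0,x]) = a,$$

then

$$\lim_{t\downarrow 0} t^r \mathscr{L}\mu(t) = a\,\Gamma(1+r),$$

where $\Gamma(s) = \int_0^\infty x^{s-1}e^{-x}\,dx$, $s > 0$, is the Euler Gamma function.


Proof Integrating by parts and setting $v(x) := \mu([0,x])$, we have

$$
\begin{aligned}
t^r \mathscr{L}\mu(t) &= t^r \int_0^\infty e^{-tx}\,d\mu(x) = t^{r+1} \int_0^\infty e^{-tx}v(x)\,dx \\
&= t^r \int_0^\infty e^{-y}v\Bigl(\frac{y}{t}\Bigr)\,dy = \int_0^\infty e^{-y}(t+y)^r\Bigl(1+\frac{y}{t}\Bigr)^{-r}v\Bigl(\frac{y}{t}\Bigr)\,dy.
\end{aligned}
$$

By assumption we have $\lim_{x\to\infty} x^{-r}v(x) = a$, and therefore, for all $y > 0$,

$$\lim_{t\downarrow 0} \Bigl(1+\frac{y}{t}\Bigr)^{-r}v\Bigl(\frac{y}{t}\Bigr) = a.$$

In particular we have $C := \sup_{x\ge 0} (1+x)^{-r}v(x) < \infty$ and therefore

$$e^{-y}(t+y)^r\Bigl(1+\frac{y}{t}\Bigr)^{-r}v\Bigl(\frac{y}{t}\Bigr) \le Ce^{-y}(t+y)^r,$$

and for $0 < t < 1$ we can bound the right-hand side by $Ce^{-y}(1+y)^r$. It follows that the
dominated convergence theorem can be applied to obtain

$$\lim_{t\downarrow 0} t^r \mathscr{L}\mu(t) = a \int_0^\infty e^{-y}y^r\,dy = a\,\Gamma(1+r).$$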

Proof of Theorem 14.48 As was observed in the course of the proof of Theorem 12.26,
the resolvent operators $R(\lambda,\Delta_{\mathrm{Dir}})$ are compact, and this implies the compactness of
the inclusion mapping of $\mathsf{D}(\Delta_{\mathrm{Dir}})$ into $L^2(D)$. By analyticity, for each $t > 0$ the
operator $S_{\mathrm{Dir}}(t)$ maps $L^2(D)$ into $\mathsf{D}(\Delta_{\mathrm{Dir}})$ boundedly, and therefore $S_{\mathrm{Dir}}(t)$ is compact as a
bounded operator on $L^2(D)$. We can now apply the spectral mapping formula of
Proposition 13.20. Evaluating the trace against an orthonormal basis consisting of eigenvectors
we conclude that $S_{\mathrm{Dir}}(t)$ is a trace class operator and

$$\operatorname{tr}(S_{\mathrm{Dir}}(t)) = \sum_{n\ge 1} e^{-\lambda_n t} < \infty,$$

where $0 < \lambda_1 \le \lambda_2 \le \cdots \to \infty$ is the enumeration, counting multiplicities, of the
eigenvalues of $-\Delta_{\mathrm{Dir}}$; the finiteness of this sum is a consequence of Weyl's theorem
(Theorem 12.29). We now apply Lemma 14.49 to the Borel measure $\mu = \sum_{n\ge 1} \delta_{\lambda_n}$. Setting
$N(x) := \max\{n \ge 1 : \lambda_n \le x\}$, by Weyl's theorem we have

$$\lim_{x\to\infty} x^{-d/2}\mu([0,x]) = \lim_{x\to\infty} x^{-d/2}N(x) = \frac{\omega_d}{(2\pi)^d}|D|,$$

where $\omega_d = \pi^{d/2}/\Gamma(1+\frac12 d)$ is the volume of the unit ball in $\mathbb{R}^d$. Lemma 14.49 allows
us to conclude that

$$\lim_{t\downarrow 0} t^{d/2}\operatorname{tr}(S_{\mathrm{Dir}}(t)) = \lim_{t\downarrow 0} t^{d/2} \sum_{n\ge 1} e^{-\lambda_n t} = \lim_{t\downarrow 0} t^{d/2}\mathscr{L}\mu(t) = \frac{\omega_d}{(2\pi)^d}|D|\,\Gamma(1+\tfrac12 d) = \frac{|D|}{(4\pi)^{d/2}}.$$
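
The small-time asymptotics of Theorem 14.48 can be observed numerically in the simplest case $d = 1$, $D = (0,1)$, where the eigenvalues of $-\Delta_{\mathrm{Dir}}$ are $\pi^2 n^2$ and the predicted limit is $1/\sqrt{4\pi}$. The following sketch is only an illustration; the function name and the truncation level of the series are choices made here.

```python
import math

def heat_trace_unit_interval(t, terms=20000):
    """Truncated series sum_{n=1}^{terms} exp(-pi^2 n^2 t) for D = (0, 1)."""
    return sum(math.exp(-(math.pi * n) ** 2 * t) for n in range(1, terms + 1))

# Predicted limit |D| / (4 pi)^{d/2} with d = 1 and |D| = 1.
predicted_limit = 1.0 / math.sqrt(4.0 * math.pi)

for t in (1e-2, 1e-4, 1e-6):
    scaled_trace = math.sqrt(t) * heat_trace_unit_interval(t)
    print(f"t = {t:g}: t^(1/2) tr S(t) = {scaled_trace:.6f}, "
          f"predicted limit = {predicted_limit:.6f}")
```

The truncation after 20000 terms is harmless for the values of $t$ used, since the neglected terms are smaller than $e^{-3000}$; for much smaller $t$ the number of terms would have to grow like $t^{-1/2}$.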


14.5.f Euler’s Identity Revisited 


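Consider the Dirichlet Laplacian $\Delta_{\mathrm{Dir}}$ on $L^2(0,1)$. As shown in Example 12.23, the
spectrum of this operator equals

$$\sigma(\Delta_{\mathrm{Dir}}) = \{-\pi^2 n^2 : n = 1,2,\dots\}$$

and consists of the eigenvalues corresponding to the eigenfunctions $f_n(t) = \sin(n\pi t)$; as
a consequence of Lemma 12.25, the spectrum of the inverse operator $\Delta_{\mathrm{Dir}}^{-1}$ is given by

$$\sigma(\Delta_{\mathrm{Dir}}^{-1}) \setminus \{0\} = \Bigl\{-\frac{1}{\pi^2 n^2} : n = 1,2,\dots\Bigr\}$$

and it again consists of the eigenvalues. Since $-\Delta_{\mathrm{Dir}}^{-1}$ is positive, it follows from Mercer's
theorem that

$$\frac{1}{\pi^2} \sum_{n=1}^{\infty} \frac{1}{n^2} = \operatorname{tr}(-\Delta_{\mathrm{Dir}}^{-1}) = \int_0^1 k(t,t)\,dt,$$

where $k$ is Green's function for the Poisson problem on the unit interval with Dirichlet
boundary conditions. From Section 11.2.a we recall that it is given by

$$k(s,t) = \begin{cases} (1-t)s, & 0 \le s \le t \le 1, \\ (1-s)t, & 0 \le t \le s \le 1. \end{cases}$$

In view of

$$\int_0^1 k(t,t)\,dt = \int_0^1 (1-t)t\,dt = \frac16$$

we recover Euler's identity

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$
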
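
Both sides of the trace identity behind Euler's formula are easy to approximate numerically. The sketch below, whose function names and discretisation parameters are chosen here for illustration, compares a partial sum of the eigenvalue series of $-\Delta_{\mathrm{Dir}}^{-1}$ with a midpoint rule for the integral of the Green's function over the diagonal.

```python
import math

def eigenvalue_partial_sum(N):
    """Partial sum of tr(-Delta_Dir^{-1}) = sum_{n>=1} 1 / (pi^2 n^2)."""
    return sum(1.0 / (math.pi * n) ** 2 for n in range(1, N + 1))

def green_diagonal_integral(M):
    """Midpoint rule with M cells for the integral of k(t,t) = (1 - t) t over (0, 1)."""
    h = 1.0 / M
    return sum((1.0 - (i + 0.5) * h) * ((i + 0.5) * h) for i in range(M)) * h

series_value = eigenvalue_partial_sum(100_000)
integral_value = green_diagonal_integral(1_000)
print(series_value, integral_value, 1.0 / 6.0)
```

The partial sum converges slowly (the tail after $N$ terms is about $1/(\pi^2 N)$), which is why a fairly large number of terms is used.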
Problems

14.1 Show that if $T \in \mathscr{L}(H)$ is a trace class operator and $U \in \mathscr{L}(H)$ is unitary, then
$UTU^*$ is a trace class operator and $\operatorname{tr}(T) = \operatorname{tr}(UTU^*)$.
14.2 Show that if $A \in \mathscr{L}(H)$ is a positive trace class operator, then $A \le \operatorname{tr}(A)I$, that is,
$\operatorname{tr}(A)I - A$ is positive.
14.3 Prove the following properties of the partial trace.
(a) If $T \in \mathscr{L}_1(H \otimes K)$, then
$$\operatorname{tr}(\operatorname{tr}_K(T)) = \operatorname{tr}(T).$$
(b) If $A_1, A_2 \in \mathscr{L}(H)$, then $(A_1 \otimes I)T(A_2 \otimes I) \in \mathscr{L}_1(H \otimes K)$ and
$$\operatorname{tr}_K((A_1 \otimes I)T(A_2 \otimes I)) = A_1 \operatorname{tr}_K(T)A_2.$$
(c) If $A \in \mathscr{L}_1(H)$ and $B \in \mathscr{L}_1(K)$, then $T := A \otimes B \in \mathscr{L}_1(H \otimes K)$ and
$$\operatorname{tr}_K(T) = \operatorname{tr}(B)A.$$
14.4 Show that if $S = x\,\bar\otimes\,x$ and $T = y\,\bar\otimes\,y$ with $\|x\| = \|y\| = 1$ are two rank one
orthogonal projections, then
$$\|S - T\|^2 = 1 - |(x|y)|^2 = 1 - \operatorname{tr}(ST).$$
14.5 Consider a bounded operator $T \in \mathscr{L}(H)$. Show that the following assertions are
equivalent:

(1) $T$ is a trace class operator, respectively Hilbert–Schmidt;
(2) $\exp(T) - I$ is a trace class operator, respectively Hilbert–Schmidt.

Hint: Compare with Problem 7.10.
14.6 Prove the two assertions made after Definition 14.30.
14.7 Let $T : L^2(0,1) \to L^\infty(0,1)$ be a bounded operator, and let $(h_n)_{n\ge 1}$ be an
orthonormal basis for $L^2(0,1)$.
(a) Show that for every $k \ge 1$ there exists a null set $N_k \subseteq (0,1)$ such that for all
$c \in \mathbb{K}^k$ we have
$$\Bigl|\sum_{j=1}^{k} c_j Th_j(t)\Bigr| \le \|T\|\|c\|, \qquad t \in (0,1) \setminus N_k.$$
(b) Deduce from part (a) that
$$\sum_{j=1}^{k} |Th_j(t)|^2 \le \|T\|^2, \qquad t \in (0,1) \setminus N_k.$$

Let $i : L^\infty(0,1) \to L^2(0,1)$ be the inclusion mapping.
(c) Show that $i \circ T$ is Hilbert–Schmidt on $L^2(0,1)$ and $\|i \circ T\|_{\mathscr{L}_2(L^2(0,1))} \le \|T\|$.
14.8 Prove that if $T \in \mathscr{L}(H)$ is selfadjoint and $S \in \mathscr{L}(H)$ is compact, and if the
commutator $[T,S]$ is a trace class operator, then $\operatorname{tr}[T,S] = 0$.

Hint: Compute the traces of $[T, S + S^*]$ relative to orthonormal bases which
diagonalise $S + S^*$.

14.9 Let $S, T \in \mathscr{L}(H)$ be selfadjoint trace class operators. Let $f : \mathbb{R} \to \mathbb{R}$ be a convex
$C^1$-function.

(a) Show that for all norm one vectors $h \in H$ we have
$$(f(T)h|h) \ge f((Th|h)) \ge f((Sh|h)) + f'((Sh|h))((T - S)h|h).$$

Hint: For the first inequality expand $h$ against an orthonormal basis of
eigenvectors of $T$.
(b) Deduce that
$$\operatorname{tr}(f(T)) \ge \operatorname{tr}(f(S)) + \operatorname{tr}(f'(S)(T - S)).$$
Hint: Show that if $h$ is an eigenvector of $S$, then the right-hand side in the
inequality of part (a) equals $((f(S) + f'(S)(T - S))h|h)$.

14.10 Prove the following analogue of Proposition 14.21: If $T \in \mathscr{L}(H)$ is a
Hilbert–Schmidt operator, with eigenvalue sequence $(\lambda_n)_{n\ge 1}$ repeated according to
algebraic multiplicity, then

$$\sum_{n\ge 1} |\lambda_n|^2 \le \|T\|_{\mathscr{L}_2(H)}^2.$$


14.11 Let $T$ be a $(d \times d)$ matrix and let $(\lambda_n)_{n=1}^{d}$ and $(\mu_n)_{n=1}^{d}$ be the sequences of
nonzero eigenvalues and singular values of $T$, respectively, repeated according
to algebraic multiplicities. Show that

$$\prod_{n=1}^{d} |\lambda_n| = \prod_{n=1}^{d} \mu_n.$$

Hint: Use the result from Linear Algebra that there exists a unitary $(d \times d)$ matrix
$U$ such that $U^*TU =: A$ is upper triangular and has $\lambda_1,\dots,\lambda_d$ on the diagonal, to
see that $\det(T) = \prod_{n=1}^{d} \lambda_n$. Apply this to $|T|$.

14.12 Let $T \in \mathscr{L}(H)$ be compact and let $\mu_1 \ge \mu_2 \ge \cdots > 0$ be the sequence of its
nonzero singular values, repeated according to multiplicities. Show that for all
$n \ge 1$ we have


where the infima are taken over all subspaces Y of H of dimension n — 1. 
Hint: Use Theorem 9.4. 
14.13 Complete the following outline of a proof of MacMahon's formula
$\det(I + A) = \sum_{j=0}^{n} \operatorname{tr}(\Lambda^j(A))$ for complex $(n \times n)$ matrices $A$.
(a) Prove the formula for the special case when $A$ is diagonalisable, by showing
that in this case the formula reduces to the identity

$$\prod_{k=1}^{n} (1 + \lambda_k) = \sum_{k=0}^{n} \sum_{1 \le i_1 < \cdots < i_k \le n} \lambda_{i_1} \cdots \lambda_{i_k}.$$

(b) Complete the proof by showing that the diagonalisable matrices are dense in
$M_n(\mathbb{C})$.
14.14 Prove the following symmetric analogue of MacMahon’s formula: for complex 
(n x n) matrices A one has 


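14.15 Let $\phi, \psi$ be smooth functions on the unit circle and let $f, g : \mathbb{D} \to \mathbb{R}$ denote their
harmonic extensions. Applying Green's theorem to $(f\,\partial g/\partial x, f\,\partial g/\partial y)$, show that
Theorem 14.47 implies the identity

$$\operatorname{tr}([T_\phi,T_\psi]) = \frac{1}{2\pi i} \int_{\mathbb{D}} \Bigl(\frac{\partial f}{\partial x}\frac{\partial g}{\partial y} - \frac{\partial g}{\partial x}\frac{\partial f}{\partial y}\Bigr)\,dx\,dy.$$

14.16 Using Fedosov's theorem, prove that if $T$ is a Fredholm operator on $H$ and
$T = U|T|$ is its polar decomposition, then

$$\operatorname{ind}(T) = \operatorname{tr}(UU^* - U^*U).$$

Hint: Show that $I - U^*U$ and $I - UU^*$ are the projections onto the null spaces of
$T$ and $T^*$ respectively, and that $\operatorname{codim} \mathsf{R}(T) = \dim \mathsf{N}(T^*)$.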
14.17 Use Fedosov's theorem to give an alternative proof of the identity

$$\operatorname{ind}(T_1T_2) = \operatorname{ind}(T_1) + \operatorname{ind}(T_2)$$

for Fredholm operators $T_1$ and $T_2$ acting on $H$.


15 
States and Observables 


In this final chapter we apply some of the ideas developed in the preceding chapters to 
set up a functional analytic framework for Quantum Mechanics. More specifically, we 
will show how the replacement of Borel sets in classical mechanics by orthogonal pro- 
jections in a Hilbert space leads, in a natural way, to the quantum mechanical formalism 
for states and observables. 


15.1 States and Observables in Classical Mechanics 


We start by taking a brief look at the notions of state and observable in classical me- 
chanics from a rather abstract measure theoretic point of view. 


15.1.a States 


In classical mechanics, the state space of a physical system is a measurable space
$(X,\mathscr{X})$, typically a manifold with its Borel $\sigma$-algebra. For example, the state space of
an ensemble of $k$ free moving point particles in $\mathbb{R}^3$ is $\mathbb{R}^{3k} \times \mathbb{R}^{3k}$ (three position
coordinates $x_j$ and three momentum coordinates $p_j$ for each particle) and that of the harmonic
oscillator (with physical constants normalised to unity) is the submanifold of $\mathbb{R} \times \mathbb{R}$
given by $x^2 + p^2 = 1$.


Definition 15.1 (States, pure states). Let $(X,\mathscr{X})$ be a measurable space.


(i) A state is a probability measure $\nu$ on $(X,\mathscr{X})$.
(ii) A pure state is an extreme point of the set of probability measures on $(X,\mathscr{X})$.


For a measurable set $B \in \mathscr{X}$, the number $\nu(B)$ is thought of as “the probability that the
state is described by a point in $B$”.

Thus we identify the “state” of a system with the ensemble of truth probabilities of
certain questions about the system. For example, the exact positions and momenta of all
particles in a gas container at a given time cannot be known with complete precision,
but one might ask about the probability of finding a certain portion of the gas in a certain
subset of the container.

Recall that a measure $\nu$ on $(X,\mathscr{X})$ is said to be atomic if, whenever we have $\nu(B) > 0$
and $B = B_0 \cup B_1$ with disjoint $B_0, B_1 \in \mathscr{X}$, it follows that either $\nu(B_0) = 0$ or $\nu(B_1) = 0$.


Proposition 15.2. The pure states are precisely the atomic probability measures.


Proof This was shown in Example 4.35.


15.1.b Observables 


Definition 15.3 (Observables). Let $(\Omega,\mathscr{F})$ be a measurable space. An $\Omega$-valued
observable is a measurable function $f : X \to \Omega$. An elementary observable is a
$\{0,1\}$-valued observable.


For example, the three position coordinates $x_j$ and momentum coordinates $p_j$ of a
free moving particle in $\mathbb{R}^3$ are real-valued observables on the state space $X = \mathbb{R}^3 \times \mathbb{R}^3$,
and so are the kinetic energy $|p|^2/2m$ (where the mass $m$ is treated as a constant) and
potentials $V(x)$.

If $\nu$ is a state on $(X,\mathscr{X})$ and $f : X \to \Omega$ is an observable, then for $F \in \mathscr{F}$ the number

$$\nu(f^{-1}(F)) = \nu(\{x \in X : f(x) \in F\})$$

belongs to the interval $[0,1]$ and is interpreted as “the probability that measuring $f$
results in a value in $F$ when the system is in state $\nu$.”


15.1.c From Classical to Quantum 


An elementary observable is of the form 1g with B € 2’. Its range equals {0,1} unless 
B=2% or B= &, in which case one has 1g = 0 and 1y = 1. Orthogonal projections 
in a complex Hilbert space enjoy similar properties spectrally: if P is an orthogonal 
projection in a Hilbert space H, its spectrum equals o(P) = {0,1} unless P = 0 or 
P =1, in these cases one has o(0) = {0} and o(J) = {1}. 

The basic idea that underlies Quantum Mechanics is to replace elementary observables
by orthogonal projections. The set of all orthogonal projections in a complex
Hilbert space $H$ is denoted by $\mathscr{P}(H)$. This set is partially ordered in a natural way by
declaring $P_1 \le P_2$ to mean that the range of $P_1$ is contained in the range of $P_2$; this is
equivalent to the statement that the operator $P_2 - P_1$ is positive. With respect to this
partial ordering, $\mathscr{P}(H)$ is a lattice in the sense of Definition 2.50; for $P_1$ and $P_2$ in $\mathscr{P}(H)$
the greatest lower bound

\[ P_1 \wedge P_2 \]

in $\mathscr{P}(H)$ is the orthogonal projection onto $R(P_1) \cap R(P_2)$, and the least upper bound

\[ P_1 \vee P_2 \]

in $\mathscr{P}(H)$ is the orthogonal projection onto the closed subspace spanned by $R(P_1)$ and
$R(P_2)$. In addition to these operations, the negation of an orthogonal projection
$P \in \mathscr{P}(H)$ is the orthogonal projection

\[ \neg P = I - P \]

onto the orthogonal complement of $R(P)$. One has the associative laws
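\[ (P_1 \wedge P_2) \wedge P_3 = P_1 \wedge (P_2 \wedge P_3), \qquad (P_1 \vee P_2) \vee P_3 = P_1 \vee (P_2 \vee P_3) \]

and the identities

\[ \neg(P_1 \wedge P_2) = \neg P_1 \vee \neg P_2, \qquad \neg(P_1 \vee P_2) = \neg P_1 \wedge \neg P_2. \]

The important difference with the classical setting is that the distributive laws

\[ P_1 \wedge (P_2 \vee P_3) = (P_1 \wedge P_2) \vee (P_1 \wedge P_3), \]
\[ P_1 \vee (P_2 \wedge P_3) = (P_1 \vee P_2) \wedge (P_1 \vee P_3) \]

generally fail.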


Example 15.4. In $\mathbb{C}^2$ consider the orthogonal projections $P_1$, $P_2$, and $P_3$ onto the first
and second coordinate axes and the diagonal, respectively. Then $P_1 \vee P_2 = I$,
$P_1 \wedge P_3 = P_2 \wedge P_3 = 0$, and

\[ (P_1 \vee P_2) \wedge P_3 = P_3, \qquad (P_1 \wedge P_3) \vee (P_2 \wedge P_3) = 0. \]
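
The failure of distributivity in this example can be checked numerically. The sketch below is not taken from the text: it approximates the meet of two projections by a high power of their product (relying on von Neumann's alternating projection theorem, an outside fact assumed here) and obtains the join from the identity $\neg(P \vee Q) = \neg P \wedge \neg Q$. The helper names `matmul`, `sub`, `meet`, `join` and `close` are illustrative.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(A, B):
    """Entrywise difference A - B of two 2x2 matrices."""
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

I2 = [[1.0, 0.0], [0.0, 1.0]]

def meet(P, Q, iterations=200):
    """Approximate the meet of P and Q by the power (PQ)^iterations.

    Assumption: (PQ)^n converges to the projection onto the intersection
    of the ranges (von Neumann's alternating projection theorem).
    """
    M = matmul(P, Q)
    R = I2
    for _ in range(iterations):
        R = matmul(R, M)
    return R

def join(P, Q):
    """Join of P and Q via not(P join Q) = (not P) meet (not Q), not P = I - P."""
    return sub(I2, meet(sub(I2, P), sub(I2, Q)))

def close(A, B, tol=1e-9):
    """Entrywise comparison up to a tolerance."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

P1 = [[1.0, 0.0], [0.0, 0.0]]   # first coordinate axis
P2 = [[0.0, 0.0], [0.0, 1.0]]   # second coordinate axis
P3 = [[0.5, 0.5], [0.5, 0.5]]   # diagonal

lhs = meet(join(P1, P2), P3)             # (P1 join P2) meet P3
rhs = join(meet(P1, P3), meet(P2, P3))   # (P1 meet P3) join (P2 meet P3)

print(close(lhs, P3), close(rhs, [[0.0, 0.0], [0.0, 0.0]]))
```

The two sides differ, in agreement with the example.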


15.2 States and Observables in Quantum Mechanics 


From now on H is a separable complex Hilbert space. 


15.2.a States 


Upon replacing indicator functions of measurable sets by orthogonal projections in $H$,
one is led to the idea of defining a state as a mapping $\nu : \mathscr{P}(H) \to [0,1]$ that satisfies
$\nu(0) = 0$, $\nu(I) = 1$, and is countably additive in the sense that

\[ \sum_{n \ge 1} \nu(P_n) = \nu(P) \]

whenever $(P_n)_{n \ge 1}$ is a (finite or infinite) sequence of pairwise disjoint orthogonal
projections and $P$ is their least upper bound, that is, $P$ is the orthogonal projection onto
the closure of the span of the ranges of $P_n$, $n \ge 1$. Here, two orthogonal projections are
called disjoint if their ranges are mutually orthogonal, and a family of projections is said
to be disjoint if every two distinct members of this family are disjoint.

Although this definition is quite satisfactory in many ways, it suffers from the defect
that it does not present an obvious way to extend $\nu$ to nonnegative linear combinations of
pairwise disjoint orthogonal projections. In the classical picture, the expected value of a
nonnegative simple function $f = \sum_{n=1}^N c_n 1_{B_n}$ in state $\nu$ is given by its integral
$\int_X f \, d\nu = \sum_{n=1}^N c_n \nu(B_n)$. The desideratum

\[ \nu\Bigl(\sum_{n=1}^N c_n P_n\Bigr) = \sum_{n=1}^N c_n \nu(P_n) \tag{15.1} \]

can be thought of as a quantum analogue of this, and constitutes the first step towards
defining the expected value for more general classes of observables. However, if one
attempts to take (15.1) as a definition, a problem of well-definedness arises (that such a
problem indeed may arise is demonstrated by the example at the end of this section).

The next definition proposes a way around this difficulty. Recall that the convex hull 
of a subset S of a vector space V is the smallest convex set in V containing S and is 
denoted by co(S). 


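Definition 15.5 (Affine mappings). Let $S$ be a subset of a vector space $V$. A mapping
$\nu : S \to [0,1]$ is called affine if it extends to a mapping $\nu : \mathrm{co}(S) \to [0,1]$ satisfying

\[ \nu\Bigl(\sum_{n=1}^N \lambda_n v_n\Bigr) = \sum_{n=1}^N \lambda_n \nu(v_n) \]

for all $N \ge 1$, $v_1, \dots, v_N \in S$, and scalars $\lambda_1, \dots, \lambda_N \ge 0$ satisfying $\sum_{n=1}^N \lambda_n = 1$.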


Let us denote by $\mathscr{P}_{\mathrm{fin}}(H)$ the set of all finite rank projections in $\mathscr{P}(H)$, that is, the
set of all projections with finite-dimensional ranges. To prepare for the definition of a
state, we prove the following result.


Proposition 15.6. Let $\nu : \mathscr{P}_{\mathrm{fin}}(H) \to [0,1]$ be affine and satisfy $\nu(0) = 0$. Then there
exists a unique positive trace class operator $T$ on $H$ such that

\[ \nu(P) = \mathrm{tr}(PT), \qquad P \in \mathscr{P}_{\mathrm{fin}}(H). \]

It satisfies

\[ \mathrm{tr}(T) = \sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \nu(P). \]

Conversely, if $T$ is a positive trace class operator on $H$, then

\[ \nu(P) := \mathrm{tr}(PT), \qquad P \in \mathscr{P}(H), \]

defines an affine mapping $\nu : \mathscr{P}(H) \to [0,1]$ satisfying $\nu(0) = 0$ and

\[ \sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \nu(P) = \sup_{P \in \mathscr{P}(H)} \nu(P) = \nu(I) = \mathrm{tr}(T). \]

Moreover, $\nu$ is countably additive.


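Proof To prove uniqueness, suppose that $T, \widetilde{T} \in \mathscr{L}_1(H)$ are such that $\mathrm{tr}(PT) = \mathrm{tr}(P\widetilde{T})$
for all $P \in \mathscr{P}_{\mathrm{fin}}(H)$. Taking $P$ to be the rank one projection $h \,\bar\otimes\, h : x \mapsto (x|h)h$, with
$h \in H$ of norm one, gives $(Th|h) = (\widetilde{T}h|h)$. By scaling, this identity extends to arbitrary
$h \in H$, and it implies $T = \widetilde{T}$ by Proposition 8.1.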

The existence proof proceeds in several steps. 


Step 1 — Throughout this step it is important to keep in mind that, when considering
general convex or nonnegative-linear combinations of projections $P_1, \dots, P_N$ in $\mathscr{P}_{\mathrm{fin}}(H)$,
the projections $P_n$ need not be mutually orthogonal and the same projection may be used
multiple times.

Fix orthogonal projections $P_1, \dots, P_N \in \mathscr{P}_{\mathrm{fin}}(H)$ and scalars $0 \le c_1, \dots, c_N \le 1$
satisfying $\sum_{n=1}^N c_n \le 1$. With $c_{N+1} := 1 - \sum_{n=1}^N c_n$ and $P_{N+1} := 0$, the affinity assumption
implies

\[ \nu\Bigl(\sum_{n=1}^{N} c_n P_n\Bigr) = \nu\Bigl(\sum_{n=1}^{N+1} c_n P_n\Bigr) = \sum_{n=1}^{N+1} c_n \nu(P_n) = \sum_{n=1}^{N} c_n \nu(P_n), \]

where we used that $\nu(0) = 0$. Also, if an operator admits two such representations, say

\[ \sum_{n=1}^{N} c_n P_n = \sum_{n=1}^{N'} c_n' P_n', \]

then by the same argument the affinity of $\nu$ implies that

\[ \sum_{n=1}^{N} c_n \nu(P_n) = \sum_{n=1}^{N'} c_n' \nu(P_n'). \]

Consider next the case of scalars $c_1, \dots, c_N \ge 0$, and consider an operator of the form
$S = \sum_{n=1}^N c_n P_n$ with projections $P_n \in \mathscr{P}_{\mathrm{fin}}(H)$. Fix an arbitrary integer $k \ge \sum_{n=1}^N c_n$. Then,
by what we just proved, the number

\[ k\,\nu\Bigl(\frac{1}{k} S\Bigr) = k\,\nu\Bigl(\sum_{n=1}^{N} \frac{c_n}{k} P_n\Bigr) = k \sum_{n=1}^{N} \frac{c_n}{k} \nu(P_n) = \sum_{n=1}^{N} c_n \nu(P_n) \]

is independent of $k$. Hence we may define an extension of $\nu$, again denoted by $\nu$, by

\[ \nu(S) := k\,\nu\Bigl(\frac{1}{k} S\Bigr) = \sum_{n=1}^{N} c_n \nu(P_n). \]

If $S$ admits two such representations, say $S = \sum_{n=1}^{N} c_n P_n = \sum_{n=1}^{N'} c_n' P_n'$, then by taking
$k \ge \max\{\sum_{n=1}^{N} c_n, \sum_{n=1}^{N'} c_n'\}$ and using the well-definedness in the case already considered
we obtain that $\nu(S)$ is well defined.

The extension just defined is finitely additive on the set of operators $S$ of the form
just described. Indeed, this follows by induction from the fact that if $S = \sum_{n=1}^{N} c_n P_n$ and
$S' = \sum_{n=N+1}^{N'} c_n P_n$ are two such operators, then

\[ \nu(S + S') = \nu\Bigl(\sum_{n=1}^{N'} c_n P_n\Bigr) = \sum_{n=1}^{N'} c_n \nu(P_n) = \sum_{n=1}^{N} c_n \nu(P_n) + \sum_{n=N+1}^{N'} c_n \nu(P_n) = \nu(S) + \nu(S'). \]

Shifting the index in the expression for $S'$ is justified since no restrictions are imposed
on the projections occurring in the expressions for $S$ and $S'$ other than their membership
of $\mathscr{P}_{\mathrm{fin}}(H)$; cf. the remark at the beginning of this step.


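Step 2 — Consider now an operator of the form $S = \sum_{n=1}^{N} c_n P_n$ with coefficients
$c_n \in \mathbb{R}$ and projections $P_n \in \mathscr{P}_{\mathrm{fin}}(H)$. Then we may write $S = S_+ - S_-$, where $S_\pm$ are
nonnegative-linear combinations of projections in $\mathscr{P}_{\mathrm{fin}}(H)$ as in Step 1, and define

\[ \nu(S) := \nu(S_+) - \nu(S_-). \]

To see that this is well defined, let $S = S_+ - S_- = S_+' - S_-'$ be two such representations.
Then $S_+ + S_-' = S_+' + S_-$, so by the finite additivity proved in Step 1,

\[ \nu(S_+) + \nu(S_-') = \nu(S_+ + S_-') = \nu(S_+' + S_-) = \nu(S_+') + \nu(S_-), \]

so $\nu(S_+) - \nu(S_-) = \nu(S_+') - \nu(S_-')$ as desired. Similarly it is checked that $c\,\nu(S) = \nu(cS)$
for all $c \in \mathbb{R}$ and that $\nu(S + S') = \nu(S) + \nu(S')$.

If $S = \sum_{n=1}^{N} c_n P_n$ with coefficients $c_n \in \mathbb{C}$ and projections $P_n \in \mathscr{P}_{\mathrm{fin}}(H)$, we set

\[ \nu(S) := \frac{1}{2}\nu(S + S^*) - \frac{i}{2}\nu(i(S - S^*)). \tag{15.2} \]

Then $\nu$ is easily seen to be additive and real-linear, and from

\[ \nu(iS) = \frac{1}{2}\nu(iS - iS^*) - \frac{i}{2}\nu(i(iS + iS^*)) = \frac{1}{2}\nu(i(S - S^*)) + \frac{i}{2}\nu(S + S^*) = i\,\nu(S) \]

it follows that $\nu$ is in fact complex-linear.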


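Step 3 — Let $S \in \mathscr{L}(H)$ be any finite rank operator. We may represent $S$ as $\sum_{n=1}^{N} c_n P_n$
with $c_1, \dots, c_N \in \mathbb{C}$ and mutually orthogonal projections $P_1, \dots, P_N \in \mathscr{P}_{\mathrm{fin}}(H)$. In doing
so, we obtain

\begin{align*}
|\nu(S)| &= \Bigl|\nu\Bigl(\sum_{n=1}^{N} c_n P_n\Bigr)\Bigr| = \Bigl|\sum_{n=1}^{N} c_n \nu(P_n)\Bigr| \\
&\le \max_{1 \le n \le N} |c_n| \sum_{n=1}^{N} \nu(P_n) = \max_{1 \le n \le N} |c_n| \, \nu\Bigl(\sum_{n=1}^{N} P_n\Bigr) \\
&\le \max_{1 \le n \le N} |c_n| = \Bigl\| \sum_{n=1}^{N} c_n P_n \Bigr\| = \|S\|. \tag{15.3}
\end{align*}

Here we used that $\nu(\sum_{n=1}^{N} P_n) \le 1$ since $\sum_{n=1}^{N} P_n$ is an orthogonal projection.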


Step 4 — By the spectral theorem (Theorem 9.1), every compact selfadjoint operator
$S \in \mathscr{K}(H)$ can be approximated, in the norm of $\mathscr{L}(H)$, by a sequence of finite rank
operators $S_n$. The estimate (15.3), applied to their differences, entails that the limit

\[ \nu(S) := \lim_{n \to \infty} \nu(S_n) \]

exists. If the finite rank operators $S_n'$ form another approximating sequence, then by
what has been proved before we have

\[ |\nu(S_n) - \nu(S_n')| = |\nu(S_n - S_n')| \le \|S_n - S_n'\| \to 0. \]

This shows that the number $\nu(S)$ is independent of the choice of approximating sequence.
For general compact operators $S \in \mathscr{K}(H)$ we define $\nu(S)$ by (15.2) and find

\[ |\nu(S)| \le \frac{1}{2}\|S + S^*\| + \frac{1}{2}\|i(S - S^*)\| \le 2\|S\|. \]

Repeating previous arguments, this extension is again seen to be linear.


Step 5 — The argument of Step 4 proves that we may identify $\nu$ with an element in
$(\mathscr{K}(H))^*$, the dual of the space $\mathscr{K}(H)$ of compact operators on $H$. By trace duality
(Theorem 14.29) there exists a unique trace class operator $T \in \mathscr{L}_1(H)$ such that for all
$S \in \mathscr{K}(H)$ we have $\nu(S) = \mathrm{tr}(ST)$. By considering the orthogonal projection $P = h \,\bar\otimes\, h$
onto the span of the norm one vector $h$, we obtain $(Th|h) = \mathrm{tr}(PT) = \nu(P) \ge 0$. This
implies that $T$ is positive.

If $(P_n)_{n \ge 1}$ is an increasing sequence of finite rank projections converging to the identity
operator strongly, then

\[ \mathrm{tr}(T) = \lim_{n \to \infty} \mathrm{tr}(P_n T) = \lim_{n \to \infty} \nu(P_n) \le \sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \nu(P). \]

In the opposite direction, for any $P \in \mathscr{P}_{\mathrm{fin}}(H)$ we have

\[ \nu(P) = \mathrm{tr}(PT) = \mathrm{tr}(TP) \le \mathrm{tr}(T). \]

Taking the supremum over all $P \in \mathscr{P}_{\mathrm{fin}}(H)$ we obtain $\sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \nu(P) \le \mathrm{tr}(T)$. This
proves the identity $\mathrm{tr}(T) = \sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \nu(P)$, thereby completing the proof of the first
assertion of the proposition.


Step 6 — We now turn to the converse statement. Let $T$ be a positive trace class operator
on $H$ and define $\nu(S) := \mathrm{tr}(ST) = \mathrm{tr}(TS)$ for $S \in \mathscr{L}(H)$. Its restriction to $\mathscr{P}(H)$,
which we shall denote by $\nu$ again, is obviously affine and satisfies $\nu(0) = 0$. To prove
countable additivity, let $(P_n)_{n \ge 1}$ be a sequence of disjoint orthogonal projections and let
$P$ be the orthogonal projection onto the closure of the span of their ranges. If $(h_j^{(n)})_{j \ge 1}$
is an orthonormal basis for the range of $P_n$, then the union of these sequences can be
relabelled into an orthonormal basis $(h_k)_{k \ge 1}$ for the range of $P$. Then,

\[ \nu(P) = \mathrm{tr}(TP) = \sum_{k \ge 1} (Th_k|h_k) = \sum_{n \ge 1} \Bigl( \sum_{j \ge 1} (Th_j^{(n)}|h_j^{(n)}) \Bigr) = \sum_{n \ge 1} \mathrm{tr}(TP_n) = \sum_{n \ge 1} \nu(P_n), \]

the third identity (the regrouping of the sum) being justified by the nonnegativity of the summands.


Step 7 — Let $P \in \mathscr{P}(H)$ be arbitrary and choose an orthonormal basis $(h_n)_{n \ge 1}$ for
$(R(P))^\perp$. Denoting the coordinate projections by $P_n$, by the countable additivity of $\nu$ we
have $\nu(P) \le \nu(P) + \sum_{n \ge 1} \nu(P_n) = \nu(I)$. This being true for any $P \in \mathscr{P}(H)$ it follows
that

\[ \sup_{P \in \mathscr{P}(H)} \nu(P) \le \nu(I). \]

On the other hand, if $(h_n)_{n \ge 1}$ is an orthonormal basis for $H$, with coordinate projections $P_n$,
the countable additivity of $\nu$ gives $\lim_{N \to \infty} \nu(\sum_{n=1}^{N} P_n) = \nu(I)$. Since $\sum_{n=1}^{N} P_n \in \mathscr{P}_{\mathrm{fin}}(H)$ this implies

\[ \sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \nu(P) \ge \nu(I). \]

In combination with the second part of Step 5, which shows that for any positive
trace class operator $T$ on $H$ we have $\mathrm{tr}(T) = \sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \mathrm{tr}(PT)$, this proves the
identities in the second part of the proposition.


In what follows we denote by $\mathscr{S}(H)$ the convex set of all positive trace class operators
with unit trace on $H$. We will see below (Proposition 15.14) that this set is the closed
convex hull of its set of extreme points and that these extreme points are precisely the
orthogonal rank one projections in $H$.

A functional $\phi : \mathscr{L}(H) \to \mathbb{C}$ is called positive if $\phi(T) \ge 0$ for every positive
$T \in \mathscr{L}(H)$, and normal if

\[ \sum_{n \ge 1} \phi(P_n) = \phi(P) \]

whenever $(P_n)_{n \ge 1}$ is a sequence of disjoint orthogonal projections in $H$ and $P$ is their
least upper bound. The same terminology applies to functionals $\phi : \mathscr{K}(H) \to \mathbb{C}$.


Theorem 15.7. The following six sets are in one-to-one correspondence:

(1) affine mappings $\nu : \mathscr{P}_{\mathrm{fin}}(H) \to [0,1]$ satisfying $\nu(0) = 0$ and $\sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \nu(P) = 1$;
(2) affine mappings $\nu : \mathscr{P}(H) \to [0,1]$ satisfying $\nu(0) = 0$ and $\nu(I) = 1$;
(3) positive trace class operators $T$ on $H$ satisfying $\mathrm{tr}(T) = 1$, via

\[ \nu(P) = \mathrm{tr}(PT), \qquad P \in \mathscr{P}_{\mathrm{fin}}(H); \]

(4) positive trace class operators $T$ on $H$ satisfying $\mathrm{tr}(T) = 1$, via

\[ \nu(P) = \mathrm{tr}(PT), \qquad P \in \mathscr{P}(H); \]

(5) positive functionals $\phi : \mathscr{K}(H) \to \mathbb{C}$ satisfying $\sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \phi(P) = 1$, via

\[ \phi(S) = \mathrm{tr}(ST), \qquad S \in \mathscr{K}(H); \]

(6) positive normal functionals $\phi : \mathscr{L}(H) \to \mathbb{C}$ satisfying $\phi(I) = 1$, via

\[ \phi(S) = \mathrm{tr}(ST), \qquad S \in \mathscr{L}(H). \]
Proof For $m, n \in \{1, \dots, 6\}$ we write $(m) \Rightarrow (n)$ to express that every object in the set
described by (m) uniquely defines an element in the set described by (n).

$(1) \Leftrightarrow (3)$: This one-to-one correspondence is contained in Proposition 15.6.

$(3) \Rightarrow (6)$: Let $T$ be a positive trace class operator with unit trace and define
$\phi : \mathscr{L}(H) \to \mathbb{C}$ as in (6). Then $\phi(I) = \mathrm{tr}(T) = 1$. To prove the positivity of $\phi$, let $S \ge 0$. If
$(h_n)_{n \ge 1}$ is an orthonormal basis for $H$, the positivity of $T$ implies

\[ \phi(S) = \mathrm{tr}(ST) = \mathrm{tr}(S^{1/2} T S^{1/2}) = \sum_{n \ge 1} (T S^{1/2} h_n | S^{1/2} h_n) \ge 0. \]

The normality of $\phi$ follows from the countable additivity of the mapping $P \mapsto \mathrm{tr}(PT)$
proved in the second part of Proposition 15.6.

$(6) \Rightarrow (5)$: This inclusion follows from Step 7 of the proof of Proposition 15.6.

$(5) \Rightarrow (1)$: The restriction $\nu := \phi|_{\mathscr{P}_{\mathrm{fin}}(H)}$ is affine, takes values in $[0,1]$, and satisfies
$\nu(0) = 0$ and $\sup_{P \in \mathscr{P}_{\mathrm{fin}}(H)} \nu(P) = 1$.

$(6) \Rightarrow (2) \Rightarrow (1)$: The first inclusion is obtained in the same way and the second again
follows from Step 7 of the proof of Proposition 15.6.

$(1) \Rightarrow (4) \Rightarrow (3)$: These inclusions are also contained in Proposition 15.6.


We may now define a state as an element of any one of these six sets. For the sake of
definiteness we take the sixth:


Definition 15.8 (States). A state is a positive normal functional $\phi : \mathscr{L}(H) \to \mathbb{C}$
satisfying $\phi(I) = 1$.

This definition captures what is generally called a normal state in the mathematical
literature on Quantum Mechanics; the term state is usually reserved for general positive
functionals $\phi : \mathscr{L}(H) \to \mathbb{C}$ satisfying $\phi(I) = 1$. The small abuse of terminology
committed by omitting the adjective ‘normal’ from our terminology may be excused by the
third item in the above list, which does not involve normality.


Remark 15.9 (Density functions). In the Physics literature, the positive trace class
operator $T$ with unit trace associated with a state $\phi$ is called the density operator associated
with $\phi$.


As the following example shows, a countably additive mapping $\nu : \mathscr{P}(\mathbb{C}^2) \to [0,1]$
satisfying $\nu(0) = 0$ and $\nu(I) = 1$ need not be affine (and therefore need not define a
state).


Example 15.10 (Failure of affineness in two dimensions). Let $H = \mathbb{C}^2$ and let $S$ denote
its unit sphere. Let $f : S \to [0,1]$ be a function with the following two properties:

(i) $f(h_1) = f(h_2)$ whenever $\mathrm{span}(h_1) = \mathrm{span}(h_2)$;
(ii) $f(h_1) + f(h_2) = 1$ whenever $h_1 \perp h_2$.

Apart from these restrictions, $f$ can be completely arbitrary.
Define $\nu : \mathscr{P}(H) \to [0,1]$ by $\nu(0) := 0$, $\nu(I) := 1$, and

\[ \nu(P_h) := f(h), \qquad h \in S, \]

where $P_h$ is the orthogonal projection onto $\mathrm{span}(h)$. It is clear that $\nu$ is countably
additive: if the orthogonal projections $P_1, P_2, \dots$ are pairwise disjoint, then all but at most
two must be zero. If there are zero or one nonzero projections, then countable additivity
is trivial, and if there are two nonzero projections they must be of the form $P_{h_1}$ and $P_{h_2}$
with $h_1 \perp h_2$; in that case countable additivity follows from

\[ \nu(P_{h_1}) + \nu(P_{h_2}) = f(h_1) + f(h_2) = 1 = \nu(I) = \nu(P_{h_1} + P_{h_2}). \]


If there exists a positive operator $T$ on $H$ with unit trace such that for all $P \in \mathscr{P}(H)$ we
have $\nu(P) = \mathrm{tr}(PT)$, then

\[ f(h) = \nu(P_h) = \mathrm{tr}(P_h T) = (Th|h) \]

depends continuously on $h$. It is, however, easy to construct discontinuous functions $f$
satisfying the conditions (i) and (ii). Indeed, once the value of $f$ at a given point $h_0 \in S$ is
fixed, the conditions (i) and (ii) fix the values of $f$ only on the points $e^{i\theta} h_0$ and all points
orthogonal to them. If we identify $S$ with the unit sphere $S^3$ in $\mathbb{R}^4$, these points define a
‘great circle’ incident with $h_0$ and an ‘equator’ relative to the ‘north pole’ $h_0$. Therefore,
in a sufficiently small neighbourhood of $h_0$, $f$ is only determined on a submanifold of
dimension 1. This leaves enough room to construct functions $f$ satisfying (i) and (ii) but
discontinuous at $h_0$.

If $\nu$ were affine we could represent it by a positive operator $T$. This would contradict
the discontinuity of $f$.
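
One concrete choice of such a function can be written down explicitly. The sketch below is an illustration, not the construction intended by the text: the particular `f` (equal to 1 on the first coordinate axis, 0 on the second, and 1/2 on every other line) and the helper names are assumptions. If this `f` were of the form $(Th|h)$ for a positive $T$ with unit trace, the values on the two axes would force $T = e_1 \,\bar\otimes\, e_1$ and hence $f(h) = |a|^2$ for $h = (a, b)$; the code compares the two.

```python
import math

def f(h, tol=1e-12):
    """Hypothetical discontinuous function on the unit sphere of C^2.

    It depends only on span(h) and satisfies f(h1) + f(h2) = 1 whenever
    h1 and h2 are orthogonal unit vectors.
    """
    a, b = h
    if abs(b) < tol:      # h spans the first coordinate axis
        return 1.0
    if abs(a) < tol:      # h spans the second coordinate axis
        return 0.0
    return 0.5            # every other line

def orthogonal(h):
    """A unit vector orthogonal to the unit vector h = (a, b) in C^2."""
    a, b = h
    return (-b.conjugate(), a.conjugate())

def quadratic_form(h):
    """(Th|h) for T = e1 (x) e1, the only candidate compatible with f on the axes."""
    return abs(h[0]) ** 2

t = 0.1
h = (complex(math.cos(t)), complex(math.sin(t)))

print(f(h) + f(orthogonal(h)))     # condition (ii) holds
print(f(h), quadratic_form(h))     # the two values differ, so no such T exists
```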


It is not a coincidence that this counterexample lives in two dimensions: a celebrated
theorem due to Gleason asserts that if $\dim(H) \ge 3$, then every countably additive mapping
$\nu : \mathscr{P}(H) \to [0,1]$ is affine and hence defines a state.


15.2.b Pure States 


Theorem 15.7 establishes six equivalent ways of looking at the convex set of all states.
Since the correspondences between them preserve convex combinations and hence
extreme points, the following definition makes sense from each of these points of view:


Definition 15.11 (Pure states). A pure state is an extreme point of the convex set of 
states. 


Proposition 15.12. A state $\phi : \mathscr{L}(H) \to \mathbb{C}$ is pure if and only if it is a vector state, that
is, there exists a unit vector $h \in H$ such that

\[ \phi(S) = (Sh|h), \qquad S \in \mathscr{L}(H). \]

This unit vector is unique up to a scalar multiple of modulus one.


The first assertion can be equivalently stated as saying that the extreme points of 
the set of all positive trace class operators with unit trace are precisely the orthogonal 
projections of rank one. 


Proof ‘Only if’: Let $\phi$ be a state and let $T$ be the associated positive trace class
operator on $H$ with unit trace. By the singular value decomposition (Theorem 14.15) we
have $T = \sum_{n \ge 1} \lambda_n h_n \,\bar\otimes\, h_n$ for some orthonormal basis $(h_n)_{n \ge 1}$ of $H$ and a nonnegative
scalar sequence $(\lambda_n)_{n \ge 1}$ such that $\sum_{n \ge 1} \lambda_n = \mathrm{tr}(T) = 1$. This allows us to write $T$ as
a convex combination of distinct states unless all but one $\lambda_n$ vanish, in which case we
have $T = h_1 \,\bar\otimes\, h_1$ for some unit vector $h_1 \in H$ and $\nu(P) = \mathrm{tr}(P \circ (h_1 \,\bar\otimes\, h_1)) = (Ph_1|h_1)$
for all orthogonal projections $P \in \mathscr{P}(H)$.

‘If’: If $\phi$ is a vector state, then the associated positive trace class operator is of
the form $T = h \,\bar\otimes\, h$ with $\|h\| = 1$. If $T = (1 - \lambda)T_0 + \lambda T_1$ is a convex combination of
positive trace class operators $T_0$ and $T_1$ with unit trace, then the unit vector
$h = Th = (1 - \lambda)T_0 h + \lambda T_1 h$ is a convex combination of two vectors of norm at most one. Hence
we must have either $h = (1 - \lambda)T_0 h$ or $h = \lambda T_1 h$. Since $T_0$ and $T_1$ are contractive, this
is only possible if $\lambda = 0$ (in the first case) or $\lambda = 1$ (in the second case). This means
that either $T = T_0$ or $T = T_1$, so $T$ is an extreme point of the convex set of positive trace
class operators on $H$ with unit trace. Since the correspondence between states and the
associated positive trace class operators preserves convex combinations, it follows that
$\phi$ is an extreme point of the convex set of states.

The uniqueness assertion follows by observing that for all $\theta \in \mathbb{R}$ and $h \in H$ we have

\[ (e^{i\theta} h) \,\bar\otimes\, (e^{i\theta} h) = h \,\bar\otimes\, h. \]


Remark 15.13 (Bras, kets, superpositions, mixed states). In the Physics literature, the
pure state corresponding to a unit vector $h \in H$ is commonly denoted by $|h\rangle$ and referred
to as the ket or wave function associated with $h$; often, $|h\rangle$ is identified with $h$.

The addition in $H$ can be used to define, for orthogonal unit vectors $h_1, h_2 \in H$ and
scalars $\alpha_1, \alpha_2 \in \mathbb{C}$ satisfying $|\alpha_1|^2 + |\alpha_2|^2 = 1$, the pure state

\[ \alpha_1 |h_1\rangle + \alpha_2 |h_2\rangle := |\alpha_1 h_1 + \alpha_2 h_2\rangle. \]

Such states are referred to as (coherent) superpositions of the states $|h_1\rangle$ and $|h_2\rangle$. They
should be carefully distinguished from states that can be built by using the addition
of $\mathscr{L}_1(H)$. Indeed, if $h_1, h_2 \in H$ are linearly independent unit vectors and
$\lambda \in [0,1]$, the convex combination $(1 - \lambda) h_1 \,\bar\otimes\, h_1 + \lambda h_2 \,\bar\otimes\, h_2$, or, in Physics notation,

\[ (1 - \lambda)\, |h_1\rangle\langle h_1| + \lambda\, |h_2\rangle\langle h_2|, \]

defines a state in $\mathscr{L}_1(H)$. Such states, which are not pure unless $\lambda = 0$ or $\lambda = 1$, are
called mixed states or, more precisely, mixtures of the states $|h_1\rangle$ and $|h_2\rangle$.
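
The difference between a superposition and a mixture is visible at the level of the associated operators. The Python sketch below is an illustration under stated assumptions: the vectors, the weights, and the helper names `outer` and `purity` are chosen for the example, and the criterion that a positive unit-trace operator $T$ is a rank one projection exactly when $\mathrm{tr}(T^2) = 1$ is a standard fact not proved in the text.

```python
import math

def outer(h, k):
    """Matrix of the rank one operator x -> (x|k) h on C^2."""
    return [[complex(hi) * complex(kj).conjugate() for kj in k] for hi in h]

def purity(T):
    """tr(T^2) for a 2x2 matrix T (equals 1 for rank one projections)."""
    return sum(T[i][j] * T[j][i] for i in range(2) for j in range(2)).real

h1, h2 = (1, 0), (0, 1)               # orthonormal vectors (illustrative choice)
alpha1 = alpha2 = 1 / math.sqrt(2)    # |alpha1|^2 + |alpha2|^2 = 1
lam = 0.5

# Coherent superposition: the pure state |alpha1 h1 + alpha2 h2>.
h = tuple(alpha1 * x + alpha2 * y for x, y in zip(h1, h2))
superposition = outer(h, h)

# Mixture: the convex combination (1 - lam) |h1><h1| + lam |h2><h2|.
A, B = outer(h1, h1), outer(h2, h2)
mixture = [[(1 - lam) * A[i][j] + lam * B[i][j] for j in range(2)]
           for i in range(2)]

print(purity(superposition), purity(mixture))
```

The superposition keeps the off-diagonal entries, while the mixture has none; only the former is a rank one projection.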


We recall that $\mathscr{S}(H)$ denotes the convex set of all positive trace class operators with
unit trace on $H$. As we have seen in Theorem 15.7, the elements of this set are in
one-to-one correspondence with states. By Proposition 15.12, the extreme points of $\mathscr{S}(H)$
are the rank one projections of the form $h \,\bar\otimes\, h$ with $h \in H$ of norm one.


Proposition 15.14. The set $\mathscr{S}(H)$ is the closed convex hull of its extreme points. The
extreme points of this set are precisely the rank one projections of the form $h \,\bar\otimes\, h$ with
$h \in H$ of norm one.


Proof By the singular value decomposition of Theorem 14.15, every element
$T \in \mathscr{S}(H)$ is of the form $T = \sum_{n \ge 1} \lambda_n h_n \,\bar\otimes\, h_n$, with convergence in trace norm, with $(h_n)_{n \ge 1}$
an orthonormal basis in $H$ and $(\lambda_n)_{n \ge 1}$ a nonnegative sequence satisfying $\sum_{n \ge 1} \lambda_n = 1$.
This gives the first assertion. The second follows from Theorem 15.7, which informs
us that the operators of the form $h \,\bar\otimes\, h$ with $h \in H$ of norm one are in one-to-one
correspondence with the vector states, which are the extreme points of the convex set of all
states by Proposition 15.12.


15.2.c Observables


Let $(\Omega, \mathscr{F})$ be a measurable space. Classically, an $\Omega$-valued observable on the state
space $(X, \mathscr{X})$ is a measurable function $f : X \to \Omega$. By definition of measurability, $f$
induces a mapping from $\mathscr{F}$ to $\mathscr{X}$ given by

\[ F \mapsto f^{-1}(F), \qquad F \in \mathscr{F}, \]

and this mapping is countably additive, in the sense that if the sets $F_n \in \mathscr{F}$ are pairwise
disjoint, then $f^{-1}(\bigcup_{n \ge 1} F_n) = \bigcup_{n \ge 1} f^{-1}(F_n)$. Identifying sets in $\mathscr{X}$ with their indicator
functions and replacing them by orthogonal projections in a Hilbert space $H$, we arrive
at the following definition of an observable in Quantum Mechanics.


Definition 15.15 (Observables). Let $(\Omega, \mathscr{F})$ be a measurable space and $H$ a Hilbert
space. An $\Omega$-valued observable is a countably additive mapping $P : \mathscr{F} \to \mathscr{P}(H)$
satisfying $P(\Omega) = I$. An elementary observable is a $\{0,1\}$-valued observable.


By Corollary 9.18, the elementary observables are precisely the orthogonal projections.
This should be compared to the classical situation, where the elementary observables
are precisely the indicator functions of measurable sets. Thus, the basic principle of
replacing indicators by orthogonal projections can now be rephrased as replacing
classical elementary observables by their quantum mechanical counterparts.

Observables defined in this way are sometimes called sharp observables, as opposed 
to unsharp observables which will be introduced in Section 15.3.b. 

Following notation introduced in Chapter 9 we write $P_F := P(F)$ for $F \in \mathscr{F}$. For
vectors $h \in H$, we denote by $P_h$ the nonnegative measure on $(\Omega, \mathscr{F})$ (a probability
measure when $\|h\| = 1$) given by

\[ P_h(F) := (P_F h|h), \qquad F \in \mathscr{F}. \]

In the language of Chapter 9 a real-valued observable is nothing but a projection-valued
measure on $\mathbb{R}$, and by the spectral theorem (Theorem 10.54) we can associate a unique
selfadjoint operator $A$ with $P$ determined by

\[ D(A) = \Bigl\{ h \in H : \int_{\mathbb{R}} |\lambda|^2 \, dP_h(\lambda) < \infty \Bigr\} \]

and, for $h \in D(A)$,

\[ (Ah|h) = \int_{\mathbb{R}} \lambda \, dP_h(\lambda) \]

(see Theorem 10.48). In the converse direction, the spectral theorem asserts that every
selfadjoint operator $A$ arises from a projection-valued measure on $\mathbb{R}$ in this way and
hence defines an observable.

Thus we arrive at the conclusion that real-valued observables are in one-to-one
correspondence with selfadjoint operators. In most treatments of Quantum Mechanics this is
simply taken as a postulate. In a sense, the present treatment provides the deeper
motivation for this postulate, in that this correspondence appears as a consequence of the point
of view that, on the mathematical level, the classical-to-quantum transition is simply
the transition from the Boolean algebra of measurable subsets of a measurable space to the
lattice of orthogonal projections on a Hilbert space. A further advantage of the present approach
is that, in the same vein, the spectral theorem for normal operators can be reinterpreted
as establishing a one-to-one correspondence between complex-valued observables and
normal operators, and between observables with values in the unit circle and unitary
operators.

We return to the abstract setting of observables $P : \mathscr{F} \to \mathscr{P}(H)$ with values in $\Omega$. If
$\phi : \mathscr{L}(H) \to \mathbb{C}$ is a pure state represented by the unit vector $h \in H$, then we have

\[ \phi(P_F) = (P_F h|h) = P_h(F), \qquad F \in \mathscr{F}, \]

so the assignment $F \mapsto \phi(P_F)$ defines a probability measure. The following proposition
is an immediate consequence of the fact that states are normal. It is the mathematical
counterpart of the so-called Born rule in Quantum Mechanics and allows us to interpret
the number $\phi(P_F)$ as “the probability that measuring $P$ results in a value contained in
$F \in \mathscr{F}$ when the system is in state $\phi$”.


Proposition 15.16 (Born rule). If $\phi : \mathscr{L}(H) \to \mathbb{C}$ is a state and $P : \mathscr{F} \to \mathscr{P}(H)$ an
$\Omega$-valued observable, the mapping

\[ F \mapsto \phi(P_F), \qquad F \in \mathscr{F}, \]

defines a probability measure on $(\Omega, \mathscr{F})$.


If $P$ is a real-valued (or complex-valued) observable represented by a selfadjoint (or,
more generally, a normal) operator $A$, then, as a projection-valued measure, $P$ is
supported on the spectrum $\sigma(A)$ and therefore $P$ can be thought of as a $\sigma(A)$-valued
observable. The physical interpretation is that “with probability one, a measurement of
$A$ produces a value belonging to $\sigma(A)$”.
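
In finite dimensions the Born rule can be evaluated by hand. The following Python sketch is an illustration only: the matrix `A`, the angles, and the helper names are assumptions for the example. It uses the fact that for a selfadjoint matrix with $A^2 = I$ the spectral projections onto the eigenvalues $\pm 1$ are $(I \pm A)/2$, and checks that the resulting probabilities add up to one.

```python
import cmath
import math

# Hypothetical real-valued observable on C^2 with spectrum {+1, -1}.
A = [[0, 1], [1, 0]]

def spectral_projection(A, eigenvalue):
    """Projection (I + eigenvalue * A) / 2; valid here because A^2 = I."""
    return [[((1 if i == j else 0) + eigenvalue * A[i][j]) / 2
             for j in range(2)] for i in range(2)]

def born_probability(P, h):
    """(P h | h) for a projection P and a unit vector h."""
    Ph = [sum(P[i][j] * h[j] for j in range(2)) for i in range(2)]
    return sum(Ph[i] * complex(h[i]).conjugate() for i in range(2)).real

theta, phi = math.pi / 3, math.pi / 4
h = (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

p_plus = born_probability(spectral_projection(A, +1), h)
p_minus = born_probability(spectral_projection(A, -1), h)
print(p_plus, p_minus, p_plus + p_minus)
```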


15.2.d The Uncertainty Principle 


If $P$ is an observable represented by a bounded selfadjoint operator $A$, the expected value
of $P$ in state $\phi$ is defined as the number

\[ \langle A \rangle_\phi := \phi(A). \]

If $\phi = |h\rangle$ is a pure state associated with a unit vector $h \in H$ contained in $D(A)$, we have

\[ \langle A \rangle_{|h\rangle} = (Ah|h). \]

In this situation, for $h \in D(A)$ we can define the variance by

\[ \mathrm{var}_{|h\rangle}(A) := \bigl\langle (A - \langle A \rangle_{|h\rangle})^2 \bigr\rangle_{|h\rangle} = \bigl\| (A - \langle A \rangle_{|h\rangle}) h \bigr\|^2. \]

The uncertainty of $A$ in state $|h\rangle$ is defined by

\[ \Delta_{|h\rangle}(A) := \bigl( \mathrm{var}_{|h\rangle}(A) \bigr)^{1/2}. \]


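Theorem 15.17 (Uncertainty principle). Let $|h\rangle$ be a pure state associated with the
unit vector $h \in H$, and consider two real-valued observables with associated selfadjoint
operators $A$ and $B$. If $h \in D([A,B]) := \{h \in D(A) \cap D(B) : Ah \in D(B),\ Bh \in D(A)\}$ and
$[A,B]h := ABh - BAh$, then

\[ \Delta_{|h\rangle}(A)\, \Delta_{|h\rangle}(B) \ge \frac{1}{2} \bigl| ([A,B]h|h) \bigr|. \]


Proof The operators $\widetilde{A} := A - (Ah|h)$ and $\widetilde{B} := B - (Bh|h)$ with domains $D(\widetilde{A}) = D(A)$
and $D(\widetilde{B}) = D(B)$ are selfadjoint. In particular we note that $\widetilde{A}h \in D(\widetilde{B})$, $\widetilde{B}h \in D(\widetilde{A})$, and
we have $[\widetilde{A}, \widetilde{B}]h = [A,B]h$. The Cauchy–Schwarz inequality implies

\begin{align*}
\Delta_{|h\rangle}(A)\, \Delta_{|h\rangle}(B) &= \|\widetilde{A}h\| \, \|\widetilde{B}h\| \ge |(\widetilde{A}h|\widetilde{B}h)| \ge |\mathrm{Im}\,(\widetilde{A}h|\widetilde{B}h)| \\
&= \frac{1}{2} \bigl| (\widetilde{A}h|\widetilde{B}h) - (\widetilde{B}h|\widetilde{A}h) \bigr| = \frac{1}{2} \bigl| ([\widetilde{A},\widetilde{B}]h|h) \bigr| = \frac{1}{2} \bigl| ([A,B]h|h) \bigr|.
\end{align*}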
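
The inequality can be tested on $2 \times 2$ matrices. The sketch below is an illustration, not part of the proof: the matrices `A` and `B`, the unit vector `h`, and the helper names are chosen for the example. It computes both sides of the inequality directly from the definitions of the uncertainty and the commutator.

```python
import cmath
import math

def apply(A, h):
    """Matrix-vector product for a 2x2 matrix A."""
    return [sum(A[i][j] * h[j] for j in range(2)) for i in range(2)]

def inner(x, y):
    """Inner product (x|y), linear in the first argument."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))

def uncertainty(A, h):
    """||(A - (Ah|h)) h|| for a unit vector h."""
    Ah = apply(A, h)
    mean = inner(Ah, h).real
    d = [a - mean * b for a, b in zip(Ah, h)]
    return math.sqrt(inner(d, d).real)

def commutator_term(A, B, h):
    """(1/2) |([A,B]h | h)|."""
    ABh, BAh = apply(A, apply(B, h)), apply(B, apply(A, h))
    return abs(inner([x - y for x, y in zip(ABh, BAh)], h)) / 2

A = [[0, 1], [1, 0]]          # illustrative selfadjoint matrices
B = [[0, -1j], [1j, 0]]

theta, phi = math.pi / 3, math.pi / 4
h = [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

lhs = uncertainty(A, h) * uncertainty(B, h)
rhs = commutator_term(A, B, h)
print(lhs, rhs)               # lhs is at least rhs
```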


The physical interpretation of the next result is that a measurement of $A$ in a pure state
$|h\rangle$ gives the expected value $(Ah|h)$ with probability one if and only if the representing
unit vector $h$ is an eigenvector of $A$, and in this case the eigenvalue equals $(Ah|h)$.


Proposition 15.18. Let $P$ be a real-valued observable, represented by the selfadjoint
operator $A$, and let $h \in D(A)$ satisfy $\|h\| = 1$. The following assertions are equivalent:

(1) $A$ has zero uncertainty in the state $|h\rangle$;
(2) $h$ is an eigenvector for $A$.

If these equivalent conditions hold, then for the corresponding eigenvalue $\lambda$ we have
$\lambda = (Ah|h)$ and $(P_{\{\lambda\}} h|h) = 1$.


Proof $(1) \Rightarrow (2)$: If $\mathrm{var}_{|h\rangle}(A) = 0$, then $Ah = (Ah|h)h$, so $h$ is an eigenvector of $A$ with
eigenvalue $\lambda = (Ah|h) = \langle A \rangle_{|h\rangle}$.

$(2) \Rightarrow (1)$: If $Ah = \lambda h$, then $(Ah|h) = \lambda$ and

\[ \mathrm{var}_{|h\rangle}(A) = \|(A - (Ah|h))h\|^2 = \|(A - \lambda)h\|^2 = 0. \]

If the equivalent conditions hold, then by Corollary 10.57 for all measurable functions
$f : \sigma(A) \to \mathbb{C}$ we have $f(A)h = f(\lambda)h$ and consequently

\[ \int_{\sigma(A)} f \, dP_h = (f(A)h|h) = f(\lambda). \]

This forces $P_h = \delta_{\{\lambda\}}$ and therefore $|h\rangle(P_{\{\lambda\}}) = (P_{\{\lambda\}} h|h) = (h|h) = 1$.


15.2.e The Qubit 


It is instructive to take a closer look at the simplest genuinely quantum mechanical
system, the qubit. It is the quantum version of the bit $\{0,1\}$, which we think of as
equipped with the counting measure giving mass 1 to each of the two elements of
$\{0,1\}$. Physically, the qubit models a spin-$\frac{1}{2}$ particle. We write $1_{\{0\}}$ and $1_{\{1\}}$ for the
unit basis vectors of the Hilbert space $L^2(\{0,1\})$ and denote the pure states associated
with them by $|0\rangle$ and $|1\rangle$. Every pure state is then of the form $\alpha|0\rangle + \beta|1\rangle$ with $\alpha, \beta \in \mathbb{C}$
satisfying $|\alpha|^2 + |\beta|^2 = 1$; here we used the ket notation $|h\rangle$ to denote the pure state
represented by a unit vector $h$. Since pure states are defined up to a complex number of
modulus one, every pure state can be uniquely written in the form

\[ \cos(\theta/2)\,|0\rangle + e^{i\varphi} \sin(\theta/2)\,|1\rangle \tag{15.4} \]

for suitable $0 \le \theta \le \pi$ and $0 \le \varphi < 2\pi$. In spherical coordinates, the variables $\theta$ and $\varphi$
uniquely determine a point

\[ (\sin\theta \cos\varphi,\ \sin\theta \sin\varphi,\ \cos\theta) \tag{15.5} \]

on the unit sphere $S^2$ of $\mathbb{R}^3$. This representation of pure states is frequently referred to
as the Bloch sphere.

In what follows we identify $L^2(\{0,1\})$ isometrically with $\mathbb{C}^2$. Under this identification,
linear operators on $L^2(\{0,1\})$ correspond to $(2 \times 2)$ matrices with complex coefficients.
States can be identified with points in the closed unit ball of $\mathbb{R}^3$ as follows. Any
selfadjoint operator $T = (t_{ij})_{i,j=1}^2$ on $\mathbb{C}^2$ with unit trace $\mathrm{tr}(T) = t_{11} + t_{22} = 1$ is of the form

\[ T = \frac{1}{2} \begin{pmatrix} 1 + c_3 & c_1 - i c_2 \\ c_1 + i c_2 & 1 - c_3 \end{pmatrix} \tag{15.6} \]

with $c_1, c_2, c_3 \in \mathbb{R}$. The vector $c = (c_1, c_2, c_3) \in \mathbb{R}^3$ is called the Bloch vector of $T$. It is
easily checked that the eigenvalues of $T$ are $\frac{1}{2}(1 \pm |c|)$. From this we see that $T \ge 0$ if and
only if $|c| \le 1$.

[Figure: The Bloch sphere (Source: Wikipedia)]

A routine computation shows that the pure state
$|h\rangle = \cos(\theta/2)\,|0\rangle + e^{i\varphi} \sin(\theta/2)\,|1\rangle$ corresponds to the operator

\[ T = h \,\bar\otimes\, h = \frac{1}{2} \begin{pmatrix} 1 + \cos\theta & e^{-i\varphi} \sin\theta \\ e^{i\varphi} \sin\theta & 1 - \cos\theta \end{pmatrix} \tag{15.7} \]

with Bloch vector $(\sin\theta \cos\varphi,\ \sin\theta \sin\varphi,\ \cos\theta)$. Thus the Bloch sphere representation
of the pure state $|h\rangle$ equals the Bloch vector of the associated operator $h \,\bar\otimes\, h$.
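
The routine computation can be replayed numerically. The following Python sketch is an illustration under stated assumptions: the angle values and the helper names `density_matrix`, `bloch_vector` and `eigenvalues` are made up for the example. It builds $h \,\bar\otimes\, h$ for the vector in (15.4), reads the Bloch vector off the entries as in (15.6), and compares with (15.5).

```python
import cmath
import math

def density_matrix(theta, phi):
    """Matrix of the rank one projection onto the span of the vector in (15.4)."""
    h = (complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2))
    return [[hi * hj.conjugate() for hj in h] for hi in h]

def bloch_vector(T):
    """Read c = (c1, c2, c3) off T = (1/2) [[1 + c3, c1 - i c2], [c1 + i c2, 1 - c3]]."""
    t21 = complex(T[1][0])
    return (2 * t21.real, 2 * t21.imag, 2 * complex(T[0][0]).real - 1)

def eigenvalues(T):
    """Eigenvalues (1 - |c|)/2 and (1 + |c|)/2 of a selfadjoint unit-trace 2x2 matrix."""
    norm_c = math.sqrt(sum(x * x for x in bloch_vector(T)))
    return ((1 - norm_c) / 2, (1 + norm_c) / 2)

theta, phi = 1.0, 2.0        # arbitrary sample angles
T = density_matrix(theta, phi)
c = bloch_vector(T)
expected = (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))
print(c, expected, eigenvalues(T))
```

For a pure state the Bloch vector has length one, so the eigenvalues are 0 and 1 up to rounding.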
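Equation (15.6) can be written as

\begin{align*}
T &= \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \frac{c_1}{2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + \frac{c_2}{2} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} + \frac{c_3}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \\
&= \frac{1}{2} (I + c_1 \sigma_1 + c_2 \sigma_2 + c_3 \sigma_3),
\end{align*}

where

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]

are the three Pauli matrices. These matrices are selfadjoint and their spectra equal $\{\pm 1\}$.
Therefore they are associated with $\pm 1$-valued observables, also denoted by $\sigma_1$, $\sigma_2$, and
$\sigma_3$. The corresponding eigenstates of $\sigma_j$ are called the spin up/spin down states along
the $j$th axis. Every selfadjoint operator $A$ on $\mathbb{C}^2$ is of the form

\[ A = \begin{pmatrix} a & c - id \\ c + id & b \end{pmatrix} = \begin{pmatrix} c_0 + c_3 & c_1 - i c_2 \\ c_1 + i c_2 & c_0 - c_3 \end{pmatrix} = c_0 I + c_1 \sigma_1 + c_2 \sigma_2 + c_3 \sigma_3 \]

for certain $a, b, c, d, c_0, c_1, c_2, c_3 \in \mathbb{R}$ with $a = c_0 + c_3$, $b = c_0 - c_3$, $c = c_1$, and $d = c_2$.
It follows that the quadruple $\{I, \sigma_1, \sigma_2, \sigma_3\}$ is a basis for the real-linear vector space of
selfadjoint operators on $\mathbb{C}^2$.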
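
The expansion of a selfadjoint matrix in the basis $\{I, \sigma_1, \sigma_2, \sigma_3\}$ can be computed directly from the relations $a = c_0 + c_3$, $b = c_0 - c_3$, $c = c_1$, $d = c_2$. The Python sketch below is an illustration; the sample matrix and the helper names `pauli_coefficients` and `combine` are assumptions for the example.

```python
I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

def pauli_coefficients(A):
    """Real coefficients (c0, c1, c2, c3) with A = c0 I + c1 S1 + c2 S2 + c3 S3."""
    a, b = complex(A[0][0]).real, complex(A[1][1]).real
    lower = complex(A[1][0])              # equals c + i d = c1 + i c2
    return ((a + b) / 2, lower.real, lower.imag, (a - b) / 2)

def combine(c):
    """Rebuild c0 I + c1 S1 + c2 S2 + c3 S3 from its coefficients."""
    return [[c[0] * I2[i][j] + c[1] * S1[i][j] + c[2] * S2[i][j] + c[3] * S3[i][j]
             for j in range(2)] for i in range(2)]

A = [[3, 1 - 2j], [1 + 2j, -1]]           # an arbitrary selfadjoint example
coeffs = pauli_coefficients(A)
print(coeffs)
print(combine(coeffs))
```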


15.2.f Entanglement 


The natural choice for the state space of a system of $N$ classical point particles in $\mathbb{R}^3$ is
$\mathbb{R}^{3N} \times \mathbb{R}^{3N}$, the idea being that six coordinates are needed (three for position, three for
momentum) to describe the state of each particle. In Quantum Mechanics, the natural
choice of Hilbert space is $L^2(\mathbb{R}^{3N})$. Labelling the points of $\mathbb{R}^{3N}$ as
$x = (x_j^{(n)})_{j=1,2,3;\ n=1,\dots,N}$, this choice suggests the following natural definition of
observables $\widehat{x}_j^{(n)}$ describing the $j$th coordinate of the $n$th particle:

\[ \widehat{x}_j^{(n)} f(x) := x_j^{(n)} f(x), \qquad f \in D(\widehat{x}_j^{(n)}),\ x \in \mathbb{R}^{3N}, \]

where $D(\widehat{x}_j^{(n)}) = \{ f \in L^2(\mathbb{R}^{3N}) : x_j^{(n)} f \in L^2(\mathbb{R}^{3N}) \}$. Later we will see that the
corresponding momentum operators are given by

\[ \widehat{p}_j^{(n)} f := \frac{1}{i} \frac{\partial f}{\partial x_j^{(n)}}, \qquad f \in D(\widehat{p}_j^{(n)}), \]

with their natural domains.
The space $L^2(\mathbb{R}^{3N})$ is isometric in a natural way to the $N$-fold Hilbert space tensor
product (see Definition 14.30 and the discussion following it):

\[ L^2(\mathbb{R}^{3N}) \simeq \underbrace{L^2(\mathbb{R}^3) \otimes \cdots \otimes L^2(\mathbb{R}^3)}_{N \text{ times}}. \]


N times 


This suggests that if the Hilbert spaces H),...,Hy describe the states of N quantum 
mechanical systems, then their Hilbert space tensor product 


H,®---@Hyn 


serves to describe the system composed of these N subsystems. In what follows we 
focus on the case N = 2, but everything we say extends to general N without difficulty. 

Let H and K be Hilbert spaces, and let H ® K be their Hilbert space tensor product. 
For unit vectors h € H and k € K we write |h) and |k) for the pure states in H and K 
represented by these vectors, and 


|) |k) == |A@k) 


for the pure state represented by the unit vectorh@kinH@K. 

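Suppose now that orthonormal vectors $h_1, h_2 \in H$ and orthonormal vectors $k_1, k_2 \in K$
are given. Then the unit vectors $h_1 \otimes k_1$ and $h_2 \otimes k_2$ are orthogonal in $H \otimes K$. Hence, for
scalars $\alpha_1, \alpha_2 \in \mathbb{C}$ satisfying $|\alpha_1|^2 + |\alpha_2|^2 = 1$, the superposition
$\alpha_1 h_1 \otimes k_1 + \alpha_2 h_2 \otimes k_2$ defines a unit vector in $H \otimes K$. To this unit vector corresponds the pure state

\[ \alpha_1 |h_1\rangle|k_1\rangle + \alpha_2 |h_2\rangle|k_2\rangle := |\alpha_1 h_1 \otimes k_1 + \alpha_2 h_2 \otimes k_2\rangle. \]

Unless $\alpha_1 = 0$ or $\alpha_2 = 0$, such states cannot be written in the form $|h\rangle|k\rangle$ and are called
entangled states.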
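
As a hedged illustration (not part of the original text), the following Python sketch tests whether a vector in the four-dimensional span of the vectors $h_i \otimes k_j$ is a product vector. It relies on the standard fact that $\sum_{i,j} c_{ij}\, h_i \otimes k_j$ is of the form $h \otimes k$ precisely when the coefficient matrix $(c_{ij})$ has rank at most one, which for a $2 \times 2$ matrix means that its determinant vanishes. The function names `superposition` and `is_product_vector` are our own.

```python
def superposition(alpha1, alpha2):
    """Coefficients c[i][j] of alpha1*h1(x)k1 + alpha2*h2(x)k2 with respect to h_i (x) k_j."""
    return [[alpha1, 0.0], [0.0, alpha2]]


def is_product_vector(c, tol=1e-12):
    """sum_ij c[i][j] h_i (x) k_j has the form h (x) k iff the 2x2 matrix c has determinant 0."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol


r = 2 ** -0.5                                       # alpha1 = alpha2 = 1/sqrt(2)
print(is_product_vector(superposition(r, r)))       # False: the state is entangled
print(is_product_vector(superposition(1.0, 0.0)))   # True: the vector equals h1 (x) k1
```

The determinant of the coefficient matrix of $\alpha_1 h_1 \otimes k_1 + \alpha_2 h_2 \otimes k_2$ equals $\alpha_1\alpha_2$, in agreement with the criterion stated above.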

The partial trace (see Section 14.4) can be used to define states of subsystems starting
from the state of a composite system. More concretely, suppose that $T \in \mathscr{L}_1(H \otimes K)$
is a positive trace class operator with unit trace. Then the operators $\mathrm{tr}_K(T)$ and $\mathrm{tr}_H(T)$
are positive trace class operators with unit trace in $\mathscr{L}_1(H)$ and $\mathscr{L}_1(K)$, respectively. If
we think of $T$ as describing the state of a system with Hilbert space $H \otimes K$, then $\mathrm{tr}_K(T)$ and
$\mathrm{tr}_H(T)$ can be thought of as describing the states of the two constituent subsystems with
Hilbert spaces $H$ and $K$, respectively. For example, by the result of Example 14.32, if
the operator $T$ corresponds to a unit vector $h \otimes k$ in $H \otimes K$, the states corresponding to
$\mathrm{tr}_K(T)$ and $\mathrm{tr}_H(T)$ are the pure states $|h\rangle$ and $|k\rangle$, that is,

\[ \mathrm{tr}_K(T) = |h\rangle\langle h|, \quad \mathrm{tr}_H(T) = |k\rangle\langle k|. \]
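
The following Python sketch, added for illustration only, computes the partial trace over $K$ of the pure state given by a unit vector $\sum_{i,j} c_{ij}\, h_i \otimes k_j$ in finite dimensions. It assumes the standard formula that the reduced state has matrix $cc^*$ with respect to the orthonormal vectors $h_i$, and it uses the quantity $\mathrm{tr}(\rho^2)$, which equals 1 exactly for pure states, to tell the two cases apart. The names `reduced_state_on_H` and `purity` are hypothetical.

```python
def reduced_state_on_H(c):
    """Matrix of tr_K(T) for the pure state T given by the unit vector sum_ij c[i][j] h_i (x) k_j."""
    n, m = len(c), len(c[0])
    return [[sum(c[i][j] * c[l][j].conjugate() for j in range(m)) for l in range(n)]
            for i in range(n)]


def purity(rho):
    """tr(rho^2); this equals 1 if and only if the state rho is pure."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real


r = 2 ** -0.5
entangled = [[r, 0.0], [0.0, r]]      # (h1 (x) k1 + h2 (x) k2) / sqrt(2)
product = [[0.6, 0.0], [0.8, 0.0]]    # (0.6 h1 + 0.8 h2) (x) k1

print(purity(reduced_state_on_H(entangled)))  # approximately 0.5: a mixed state
print(purity(reduced_state_on_H(product)))    # approximately 1.0: the pure state |h><h|
```

For the product vector the reduced state is the rank one projection onto $h = 0.6h_1 + 0.8h_2$, in line with the example above, while for the entangled vector the subsystem is in the mixed state $\frac12 I$.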
We discuss next a natural extension of the notion of an observable.


15.3.a Effects 
As a warm-up we show: 


Proposition 15.19. Let $(X, \mathscr{X})$ be a measurable space. The closed convex hull in $B_b(X)$
of the set of elementary observables $\{1_B : B \in \mathscr{X}\}$ equals

\[ \mathscr{E}(X) := \{f \in B_b(X) : 0 \le f \le 1 \text{ pointwise}\}. \]

The extreme points of $\mathscr{E}(X)$ are precisely the elementary observables $1_B$, $B \in \mathscr{X}$.


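Proof Denote by $E(X)$ the closed convex hull of the set of elementary observables.
The inclusion $E(X) \subseteq \mathscr{E}(X)$ is trivial. To prove the inclusion $\mathscr{E}(X) \subseteq E(X)$, let
$f \in \mathscr{E}(X)$ be given. Given $\varepsilon > 0$, select a simple function $g = \sum_{j=1}^k c_j 1_{B_j}$ such that
$\|f - g\|_\infty < \varepsilon$; this function may be chosen in such a way that the measurable sets $B_j$ are
disjoint and the coefficients satisfy $0 \le c_j \le 1$. After relabelling we may assume that
$0 \le c_1 \le \cdots \le c_k \le 1$.
If $k = 1$, then $g = (1 - c_1) 1_\emptyset + c_1 1_{B_1}$ belongs to $E(X)$. If $k \ge 2$ we set

\[ C_0 := \emptyset \quad \text{and} \quad C_i := \bigcup_{j=i}^k B_j \quad (i = 1, \dots, k) \]

and

\[ \lambda_0 := 1 - c_k, \quad \lambda_1 := c_1, \quad \text{and} \quad \lambda_i := c_i - c_{i-1} \quad (i = 2, \dots, k). \]

Then $0 \le \lambda_i \le 1$, $\sum_{i=0}^k \lambda_i = 1$, and

\[ \sum_{j=1}^k c_j 1_{B_j} = \sum_{i=0}^k \lambda_i 1_{C_i}. \]

It follows that $g$ belongs to $E(X)$. Since $\varepsilon > 0$ was arbitrary, this proves that $f \in E(X)$.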
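
The telescoping step in this argument can be checked numerically. The Python sketch below is an illustration under our own naming (`convex_weights`, `values_on_blocks`): given ordered coefficients $0 \le c_1 \le \cdots \le c_k \le 1$ it computes the weights $\lambda_0, \dots, \lambda_k$ and evaluates the convex combination of indicators on each set $B_j$, assuming that the $i$th indicator is that of the union $B_i \cup \cdots \cup B_k$ (and that of the empty set for $i = 0$).

```python
def convex_weights(c):
    """Weights lambda_0, ..., lambda_k for coefficients 0 <= c[0] <= ... <= c[k-1] <= 1."""
    lam = [1 - c[-1], c[0]]
    lam += [c[i] - c[i - 1] for i in range(1, len(c))]
    return lam


def values_on_blocks(lam):
    """Value of the convex combination on B_1, ..., B_k.

    B_j is contained in the unions B_i u ... u B_k exactly for i = 1, ..., j,
    so the value on B_j is lambda_1 + ... + lambda_j.
    """
    k = len(lam) - 1
    return [sum(lam[1:j + 1]) for j in range(1, k + 1)]


c = [0.2, 0.5, 0.9]
lam = convex_weights(c)
print(lam)                    # weights in [0, 1] summing to 1 (up to rounding)
print(values_on_blocks(lam))  # recovers the coefficients c (up to rounding)
```

For a single coefficient the function returns the two weights $1 - c_1$ and $c_1$, which is the case $k = 1$ treated separately in the proof.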

If $g \in \mathscr{E}(X)$ is an elementary observable and $g = \lambda f_0 + (1 - \lambda) f_1$ with $0 < \lambda < 1$
and $0 \le f_j \le 1$ for $j = 0, 1$, then $0 = g(\xi) = \lambda f_0(\xi) + (1 - \lambda) f_1(\xi)$ implies $f_0(\xi) = f_1(\xi) = 0$
and $1 = g(\xi') = \lambda f_0(\xi') + (1 - \lambda) f_1(\xi')$ implies $f_0(\xi') = f_1(\xi') = 1$, that is,
$f_0 = f_1 = g$ pointwise. It follows that every elementary observable is an extreme point
of $\mathscr{E}(X)$. If $g \in \mathscr{E}(X)$ is not an elementary observable, then the set $\{\varepsilon < g < 1 - \varepsilon\}$ is
nonempty for sufficiently small $\varepsilon > 0$, and then it is easy to produce measurable $f_0 \ne f_1$
satisfying $0 \le f_j \le 1$ for $j = 0, 1$ and $g = \frac12 f_0 + \frac12 f_1$. It follows that $g$ is not an extreme
point of $\mathscr{E}(X)$.


The quantum mechanical counterpart of the elementary observables are the orthogonal
projections. In analogy to the above result we now characterise the closed convex
hull of $\mathscr{P}(H)$ in $\mathscr{L}(H)$. We write $S \le T$ to express that $T - S$ is a positive operator.


Proposition 15.20. The closed convex hull in $\mathscr{L}(H)$ of $\mathscr{P}(H)$ equals

\[ \mathscr{E}(H) := \{E \in \mathscr{L}(H) : 0 \le E \le I\}. \]

The extreme points of $\mathscr{E}(H)$ are precisely the orthogonal projections.


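Proof Every element of the convex hull of $\mathscr{P}(H)$ belongs to $\mathscr{E}(H)$, and this passes on
to the closed convex hull.
Since elements of $\mathscr{E}(H)$ are positive and hence selfadjoint, every $E \in \mathscr{E}(H)$ admits a
representation as

\[ E = \int_{\sigma(E)} \lambda \, dP(\lambda), \]

where $P$ is the projection-valued measure of $E$. Let

\[ f_n := \sum_{j=1}^{2^n} \frac{j}{2^n} 1_{I_j^n} = \frac{1}{2^n} \sum_{j=1}^{2^n} 1_{((j-1)/2^n,\,1]}, \]

where $I_j^n := (\frac{j-1}{2^n}, \frac{j}{2^n}]$ for $1 \le j \le 2^n$. Set

\[ E_n := \int_{\sigma(E)} f_n(\lambda) \, dP(\lambda) = \frac{1}{2^n} \sum_{j=1}^{2^n} P_{((j-1)/2^n,\,1]}. \]

Then $E_n$ is in the convex hull of $\mathscr{P}(H)$ and

\[ \lim_{n \to \infty} \|E - E_n\| \le \lim_{n \to \infty} \sup_{\lambda \in \sigma(E)} |\lambda - f_n(\lambda)| = 0. \]

This proves that $E$ is in the closed convex hull of $\mathscr{P}(H)$.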

If $E \in \mathscr{E}(H)$ is an orthogonal projection and $E = \lambda E_0 + (1 - \lambda) E_1$ with $0 < \lambda < 1$
and $0 \le E_j \le I$ for $j = 0, 1$, then for all $x \in N(E)$ we have $\lambda (E_0 x|x) + (1 - \lambda)(E_1 x|x) = 0$
with $(E_i x|x) \ge 0$ for $i = 0, 1$, and this is possible only if $(E_0 x|x) = (E_1 x|x) = 0$. For all
norm one vectors $x \in R(E)$ we have $(Ex|x) = (x|x) = 1$ and consequently $\lambda (E_0 x|x) +
(1 - \lambda)(E_1 x|x) = 1$. Since $(E_i x|x) \le 1$ for $i = 0, 1$, this is possible only if $(E_0 x|x) =
(E_1 x|x) = 1$. It follows that $E_0 = E_1 = 0$ on $N(E)$ and $E_0 = E_1 = I$ on $R(E)$, and therefore
$E_0 = E_1 = E$. It follows that $E$ is an extreme point of $\mathscr{E}(H)$.
If $E \in \mathscr{E}(H)$ is not an orthogonal projection, then the spectral theorem for bounded
selfadjoint operators implies that the spectrum $\sigma(E)$ cannot be contained in $\{0, 1\}$. Since
$\sigma(E)$ is contained in $[0, 1]$ it follows that $[\varepsilon, 1 - \varepsilon] \cap \sigma(E)$ is nonempty for all sufficiently
small $\varepsilon > 0$ and then, again by the spectral theorem, it is easy to produce operators
$E_0 \ne E_1$ in $\mathscr{E}(H)$ such that $E = \frac12 E_0 + \frac12 E_1$. It follows that $E$ is not an extreme point of
$\mathscr{E}(H)$.


Definition 15.21 (Effects). An effect is an element of the set $\mathscr{E}(H)$.


Effects are selfadjoint, and it follows from Theorem 8.11 that a selfadjoint operator
on $H$ is an effect if and only if its spectrum is contained in the unit interval $[0, 1]$. If $T$ is
an arbitrary nonzero positive operator, then for all $0 \le c \le \|T\|^{-1}$ the operator $cT$ is an
effect. Indeed, this is clear for $c = 0$, and if $c > 0$ the operator $c^{-1} - T$ is positive since
it is selfadjoint and has positive spectrum.

A mapping $\nu : \mathscr{E}(H) \to [0, 1]$ is said to be finitely additive if

\[ \sum_{n=1}^N \nu(E_n) = \nu(E) \]

whenever $E_1, \dots, E_N, E \in \mathscr{E}(H)$ satisfy $E_1 + \cdots + E_N = E$.


Theorem 15.22 (Busch). Every finitely additive mapping $\nu : \mathscr{E}(H) \to [0, 1]$ satisfying
$\nu(I) = 1$ restricts to an affine mapping $\nu : \mathscr{P}(H) \to [0, 1]$ and hence defines a state.


Proof By assumption we have $\nu(I) = 1$ and from $1 = \nu(I) = \nu(I + 0) = \nu(I) + \nu(0) =
1 + \nu(0)$ it follows that $\nu(0) = 0$. By additivity, the restriction of $\nu$ to $\mathscr{P}(H)$, viewed as a
mapping into $[0, 1]$, is affine.


15.3.b Positive Operator-Valued Measures


The next definition generalises the notion of a projection-valued measure by replacing 
the role of orthogonal projections by effects. 


Definition 15.23 (Positive operator-valued measures). A positive operator-valued measure
(POVM) on a measurable space $(\Omega, \mathscr{F})$ is a mapping $Q : \mathscr{F} \to \mathscr{E}(H)$ that assigns
to every set $F \in \mathscr{F}$ an effect $Q_F := Q(F) \in \mathscr{E}(H)$ with the following properties:

(i) $Q_\Omega = I$,
(ii) for all $x \in H$ the mapping

\[ F \mapsto (Q_F x|x), \quad F \in \mathscr{F}, \]

defines a measure $Q_x$ on $(\Omega, \mathscr{F})$.

The measure defined by (ii) is denoted by $Q_x$. Thus, for all $F \in \mathscr{F}$ and $x \in H$, by
definition we have

\[ (Q_F x|x) = Q_x(F) = \int_\Omega 1_F \, dQ_x. \]

Note that

\[ Q_x(\Omega) = (Q_\Omega x|x) = (x|x) = \|x\|^2. \]

This shows that the measures $Q_x$ are finite.
Every projection-valued measure is a POVM. In the converse direction we have the 
following simple result. 


Proposition 15.24. A POVM $Q : \mathscr{F} \to \mathscr{E}(H)$ is a projection-valued measure if and only
if $Q_F Q_{F'} = Q_{F \cap F'}$ for all $F, F' \in \mathscr{F}$.

Proof The ‘only if’ part has already been established in Section 9.2. The ‘if’ part is
evident from $Q_F^2 = Q_{F \cap F} = Q_F$, which shows that each $Q_F$ is a projection. Since $Q_F$ is
also positive, it is an orthogonal projection.


A POVM which is not projection-valued is sometimes called an unsharp observable. 
An example will be discussed in Section 15.3.d. 
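
Before turning to that example, here is a small finite-dimensional illustration of Proposition 15.24, written as a Python sketch. The two-outcome family $Q_{\{\pm 1\}} = \frac12 (I \pm \lambda \sigma_3)$ on $\mathbb{C}^2$ with $0 \le \lambda \le 1$ is our own choice of example and does not appear in the text; it is a POVM on $\{\pm 1\}$, and the multiplicativity criterion holds only in the sharp case $\lambda = 1$. All function names are hypothetical.

```python
def unsharp_sigma3(lam):
    """Effects Q_{+1} = (I + lam*sigma_3)/2 and Q_{-1} = (I - lam*sigma_3)/2 on C^2."""
    plus = [[(1 + lam) / 2, 0.0], [0.0, (1 - lam) / 2]]
    minus = [[(1 - lam) / 2, 0.0], [0.0, (1 + lam) / 2]]
    return plus, minus


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def is_projection_valued(plus, minus):
    """Check Q_F Q_F' = Q_(F n F') on the sets {+1} and {-1}; the remaining cases are automatic."""
    zero = [[0.0, 0.0], [0.0, 0.0]]
    return (close(matmul(plus, plus), plus)
            and close(matmul(minus, minus), minus)
            and close(matmul(plus, minus), zero))


print(is_projection_valued(*unsharp_sigma3(1.0)))  # True: the spectral measure of sigma_3
print(is_projection_valued(*unsharp_sigma3(0.5)))  # False: an unsharp observable
```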

We have seen in Proposition 15.16 that if $P : \mathscr{F} \to \mathscr{P}(H)$ is a projection-valued
measure, then for every state $\phi$ the mapping

\[ F \mapsto \phi(P_F), \quad F \in \mathscr{F}, \]

is a probability measure on $(\Omega, \mathscr{F})$. This sets up an affine mapping from $\mathscr{S}(H)$ to the
convex set $M_1^+(\Omega)$ of probability measures on $(\Omega, \mathscr{F})$; we recall that $\mathscr{S}(H)$ denotes
the convex set of all positive trace class operators with unit trace on $H$. As we have seen
in Proposition 15.14, this set is the closed convex hull of its extreme points, which are
precisely the rank one projections of the form $h \,\bar\otimes\, h$ with $h \in H$ of norm one.

Inspection of this argument shows that it extends to POVMs. The following theorem
shows that in the converse direction, every POVM arises in this way.


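Theorem 15.25 (POVMs as unsharp observables). Let $(\Omega, \mathscr{F})$ be a measurable space.
If $\Phi : \mathscr{S}(H) \to M_1^+(\Omega)$ is an affine mapping, then there exists a unique POVM $Q : \mathscr{F} \to
\mathscr{E}(H)$ such that for all $T \in \mathscr{S}(H)$ we have

\[ (\Phi(T))(F) = \mathrm{tr}(Q_F T), \quad F \in \mathscr{F}. \]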


Proof The proof consists of two steps. 


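Step 1 – We claim that $\Phi$ extends to a bounded operator from $\mathscr{L}_1(H)$ into $M(\Omega)$. The
proof of this claim is accomplished in three steps. First, we set $\Phi(0) := 0$ and, for an
arbitrary nonzero positive operator $T \in \mathscr{L}_1(H)$,

\[ \Phi(T) := \|T\|_1 \, \Phi(T / \|T\|_1), \]

where $\|T\|_1 = \mathrm{tr}(T) > 0$ since $T$ is positive and nonzero. Note that for all $c \ge 0$ we have
$\Phi(cT) = c\,\Phi(T)$.

The identity

\[ S + T = (\|S\|_1 + \|T\|_1) \Bigl( \lambda \frac{S}{\|S\|_1} + (1 - \lambda) \frac{T}{\|T\|_1} \Bigr), \]

where $\lambda = \|S\|_1 / (\|S\|_1 + \|T\|_1)$, implies that if $S, T \in \mathscr{L}_1(H)$ are positive and nonzero
(the case where $S$ or $T$ vanishes being trivial), then

\begin{align*}
\Phi(S + T) &= \Phi\Bigl( (\|S\|_1 + \|T\|_1) \Bigl( \lambda \frac{S}{\|S\|_1} + (1 - \lambda) \frac{T}{\|T\|_1} \Bigr) \Bigr) \\
&= (\|S\|_1 + \|T\|_1) \, \Phi\Bigl( \lambda \frac{S}{\|S\|_1} + (1 - \lambda) \frac{T}{\|T\|_1} \Bigr) \\
&= (\|S\|_1 + \|T\|_1) \Bigl( \lambda \, \Phi\Bigl( \frac{S}{\|S\|_1} \Bigr) + (1 - \lambda) \, \Phi\Bigl( \frac{T}{\|T\|_1} \Bigr) \Bigr) \\
&= \|S\|_1 \, \Phi\Bigl( \frac{S}{\|S\|_1} \Bigr) + \|T\|_1 \, \Phi\Bigl( \frac{T}{\|T\|_1} \Bigr) = \Phi(S) + \Phi(T),
\end{align*}

where we used the assumption that $\Phi$ is affine. Applying this to $aS$ and $bT$ with $a, b \ge 0$
we find that

\[ \Phi(aS + bT) = \Phi(aS) + \Phi(bT) = a\,\Phi(S) + b\,\Phi(T). \]

Next, for an arbitrary selfadjoint $T \in \mathscr{L}_1(H)$ write $T = T_1 - T_2$ with $T_1, T_2$ positive
operators in $\mathscr{L}_1(H)$. Such decompositions always exist; one could take for instance
$T_1 = \frac12 (T + |T|)$ and $T_2 = T_1 - T$. We then set

\[ \Phi(T) := \Phi(T_1) - \Phi(T_2). \]

To see that this is well defined, suppose that we also have $T = T_1' - T_2'$ with $T_1', T_2'$ positive
operators in $\mathscr{L}_1(H)$. Then $T_1 + T_2' = T_1' + T_2$ and hence, by what we just proved,

\[ (\Phi(T_1) - \Phi(T_2)) - (\Phi(T_1') - \Phi(T_2')) = \Phi(T_1 + T_2') - \Phi(T_1' + T_2) = 0. \]

As in the proof of Theorem 15.7 it is checked that $\Phi$ is real-linear.
Finally, for an arbitrary $T \in \mathscr{L}_1(H)$ we set

\[ \Phi(T) := \Phi(A) + i\,\Phi(B), \]

where $A := \frac12 (T + T^*)$ and $B := \frac{1}{2i} (T - T^*)$ are the unique selfadjoint operators such
that $T = A + iB$. As in the proof of Theorem 15.7 it is checked that $\Phi$ is linear.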


Step 2 – We now turn to the proof of the theorem. Using the extension provided
by Step 1, for every fixed $F \in \mathscr{F}$ the mapping $T \mapsto (\Phi(T))(F)$ defines a bounded
functional on $\mathscr{L}_1(H)$ and therefore by Theorem 14.29 it defines a bounded operator
$Q_F \in \mathscr{L}(H)$ such that

\[ (\Phi(T))(F) = \mathrm{tr}(T Q_F), \quad T \in \mathscr{L}_1(H). \]

For all norm one vectors $h \in H$ we have

\[ (Q_F h|h) = \mathrm{tr}((h \,\bar\otimes\, h) \circ Q_F) = (\Phi(h \,\bar\otimes\, h))(F) \in [0, 1], \]

which gives the operator inequality $0 \le Q_F \le I$, that is, we have $Q_F \in \mathscr{E}(H)$.
It is clear that $Q_\Omega = I$, and for every norm one vector $h \in H$ the measure

\[ F \mapsto (Q_F h|h) = (\Phi(h \,\bar\otimes\, h))(F), \quad F \in \mathscr{F}, \]

is a probability measure. This proves that $Q : F \mapsto Q_F$ is a POVM.
Uniqueness is clear since $\mathrm{tr}(T Q_F) = 0$ for all $T \in \mathscr{S}(H)$ implies $Q_F = 0$.


Remark 15.26. The assumption that $\Phi$ should be affine is a reasonable one in the light
of the following argument. Suppose we have two quantum mechanical systems at our
disposal, represented by the operators $T_1$ and $T_2$ in $\mathscr{S}(H)$ describing their states. We
use a classical coin to decide which state is going to be observed: if, with probability
$p$, ‘heads’ comes up we observe the system corresponding to $T_1$; otherwise we observe
the system corresponding to $T_2$. This experiment can be described as observing the state
corresponding to the convex combination $pT_1 + (1 - p)T_2$. If $\Phi$ is the observable to be
measured, we expect the probability distribution of the outcomes, $\Phi(pT_1 + (1 - p)T_2)$,
to be given by $p\,\Phi(T_1) + (1 - p)\,\Phi(T_2)$.


POVMs admit a bounded functional calculus, but an important difference with the 
bounded functional calculus for projection-valued measures of Theorem 9.8 is that the 
calculus for POVMs fails to be multiplicative (see, however, (15.10) for a partial result 
on multiplicativity). 


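Proposition 15.27 (Bounded functional calculus for POVMs). Let $Q : \mathscr{F} \to \mathscr{E}(H)$ be
a POVM. There exists a unique linear mapping $\Psi : B_b(\Omega) \to \mathscr{L}(H)$ satisfying

\[ \Psi(1_F) = Q_F, \quad F \in \mathscr{F}, \]

and

\[ \|\Psi(f)\| \le \|f\|_\infty, \quad f \in B_b(\Omega). \]

It satisfies

\[ \Psi(f)^* = \Psi(\overline{f}), \quad f \in B_b(\Omega). \]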
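Proof For $x, y \in H$ consider the complex measure $Q_{x,y}$ defined by

\[ Q_{x,y}(F) := (Q_F x|y), \quad F \in \mathscr{F}. \]

That this indeed defines a measure follows by a polarisation argument from the countable
additivity of the measures $Q_x$, $x \in H$. For any measurable partition $\Omega = F_1 \cup \cdots \cup F_k$
we have, by the Cauchy–Schwarz inequality applied twice,

\begin{align*}
\sum_{j=1}^k |Q_{x,y}(F_j)| = \sum_{j=1}^k |(Q_{F_j} x|y)| &\le \sum_{j=1}^k (Q_{F_j} x|x)^{1/2} (Q_{F_j} y|y)^{1/2} \\
&\le \Bigl( \sum_{j=1}^k Q_x(F_j) \Bigr)^{1/2} \Bigl( \sum_{j=1}^k Q_y(F_j) \Bigr)^{1/2} \\
&= Q_x(\Omega)^{1/2} Q_y(\Omega)^{1/2} = \|x\| \|y\|,
\end{align*}

from which it follows that $Q_{x,y}$ has finite variation $|Q_{x,y}|(\Omega) \le \|x\| \|y\|$.
For $f \in B_b(\Omega)$ define

\[ a_f(x, y) := \int_\Omega f \, dQ_{x,y}, \quad x, y \in H. \]

The form $a_f$ is sesquilinear and bounded and defines a bounded operator $\Psi(f)$ on $H$ by
Proposition 9.15. It is clear that $\Psi(1_F) = Q_F$ for all $F \in \mathscr{F}$ and

\[ |(\Psi(f) x|y)| = \Bigl| \int_\Omega f \, dQ_{x,y} \Bigr| \le \int_\Omega |f| \, d|Q_{x,y}| \le \|f\|_\infty \|x\| \|y\|. \]

The identity $(\Psi(f))^* = \Psi(\overline{f})$ is a consequence of $\overline{Q_{x,y}} = Q_{y,x}$, from which it follows
that

\begin{align*}
((\Psi(f))^* x|y) = (x|\Psi(f) y) &= \overline{(\Psi(f) y|x)} = \overline{a_f(y, x)} \\
&= \overline{\int_\Omega f \, dQ_{y,x}} = \int_\Omega \overline{f} \, dQ_{x,y} = a_{\overline{f}}(x, y) = (\Psi(\overline{f}) x|y).
\end{align*}

Uniqueness is clear from the fact that $\Psi(1_F) = Q_F$ and the simple functions are dense
in $B_b(\Omega)$.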


15.3.c Naimark’s Theorem 


If $J$ is an isometry from $H$ into another Hilbert space $\widetilde{H}$ and $P$ is an orthogonal projection
in $\widetilde{H}$, then $J^* P J$ is an effect in $H$: for all $x \in H$ we have

\[ 0 \le (PJx|Jx) = \|PJx\|^2 \le \|Jx\|^2 = (x|x) \]

and therefore $0 \le J^* P J \le I$. This gives a method of producing POVMs from projection-valued
measures:

Proposition 15.28 (Compression). Let $J$ be an isometry from $H$ into another Hilbert
space $\widetilde{H}$. If $P : \mathscr{F} \to \mathscr{P}(\widetilde{H})$ is a projection-valued measure, then $Q := J^* P J : \mathscr{F} \to
\mathscr{E}(H)$ is a POVM.

Proof By what we just observed, $Q$ maps sets $F \in \mathscr{F}$ to elements of $\mathscr{E}(H)$. It is clear
that $Q_\Omega = J^* J = I$. To see that $Q$ is a POVM, it remains to observe that for all $x \in H$
and $F \in \mathscr{F}$ we have

\[ Q_x(F) = (Q_F x|x) = (P_F Jx|Jx) = P_{Jx}(F), \]

from which it follows that $Q_x$ is a finite measure on $(\Omega, \mathscr{F})$.
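
As a concrete finite-dimensional instance of this compression, the following Python sketch (our own example, not taken from the text) uses an isometry $J$ from $\mathbb{C}^2$ into $\mathbb{C}^3$ whose rows are the vectors $\sqrt{2/3}\,(\cos(2\pi k/3), \sin(2\pi k/3))$, $k = 0, 1, 2$, together with the projection-valued measure on $\{0, 1, 2\}$ given by the coordinate projections of $\mathbb{C}^3$. The compressed operators sum to the identity and are effects, but they are not projections.

```python
import math

# Rows of J: a real 3x2 matrix with orthonormal columns, that is, an isometry C^2 -> C^3.
J = [[math.sqrt(2 / 3) * math.cos(2 * math.pi * k / 3),
      math.sqrt(2 / 3) * math.sin(2 * math.pi * k / 3)] for k in range(3)]


def compress(k):
    """J^* P_k J, where P_k is the projection of C^3 onto its k-th coordinate axis."""
    row = J[k]
    return [[row[i] * row[j] for j in range(2)] for i in range(2)]  # J is real, so no conjugates


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


Q = [compress(k) for k in range(3)]
total = add(add(Q[0], Q[1]), Q[2])   # equals J^* J, the identity on C^2
square = matmul(Q[0], Q[0])          # differs from Q[0]: the effect Q[0] is not a projection

print(total)
print(Q[0], square)
```

Each of the three effects is $\frac23$ times a rank one projection, so its spectrum is $\{0, \frac23\}$.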


The main result of this section is Naimark’s theorem, which asserts that, conversely, 
every POVM arises in this way. 


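Theorem 15.29 (Naimark). Let $(\Omega, \mathscr{F})$ be a measurable space and let $Q : \mathscr{F} \to \mathscr{E}(H)$
be a POVM. There exists a Hilbert space $\widetilde{H}$, a projection-valued measure $P : \mathscr{F} \to
\mathscr{P}(\widetilde{H})$, and an isometry $J : H \to \widetilde{H}$ such that

\[ Q_F = J^* P_F J, \quad F \in \mathscr{F}. \]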


To motivate the proof of this theorem we consider first the special case $\Omega = \mathbb{T}$ and
$\mathscr{F} = \mathscr{B}(\mathbb{T})$ its Borel $\sigma$-algebra. If $Q : \mathscr{B}(\mathbb{T}) \to \mathscr{E}(H)$ is a POVM, the operator

\[ T := \int_{\mathbb{T}} \lambda \, dQ(\lambda) \]

is a contraction on $H$ by Proposition 15.27. By the Sz.-Nagy dilation theorem (Theorem
8.36) there exist a Hilbert space $\widetilde{H}$, a unitary operator $U \in \mathscr{L}(\widetilde{H})$, and an isometry
$J : H \to \widetilde{H}$ such that

\[ T^n = J^* U^n J, \quad n \in \mathbb{N}. \]

Using the spectral theorem for bounded normal operators, let $P : \mathscr{B}(\mathbb{T}) \to \mathscr{P}(\widetilde{H})$ be its
associated projection-valued measure. Then, by the properties of the bounded functional
calculus of $U$,

\[ T^n = J^* U^n J = J^* \Bigl( \int_{\mathbb{T}} z^n \, dP(z) \Bigr) J = \int_{\mathbb{T}} z^n \, dQ(z), \quad n \in \mathbb{N}. \]

We claim that $P$ has the desired properties. Indeed, for all $x \in H$ we have

\[ \int_{\mathbb{T}} \lambda^n \, dQ_x(\lambda) = (T^n x|x) = (U^n Jx|Jx) = \int_{\mathbb{T}} \lambda^n \, dP_{Jx}(\lambda). \]

This means that the nonnegative Fourier coefficients of the probability measures $Q_x$ and
$P_{Jx}$ agree. Hence $Q_x = P_{Jx}$ by Theorem 5.32 and the observation following it. But this
implies, for all Borel subsets $B \in \mathscr{B}(\mathbb{T})$,

\[ (Q_B x|x) = Q_x(B) = P_{Jx}(B) = (P_B Jx|Jx) = (J^* P_B Jx|x). \]

This being true for all $x \in H$, we conclude that $Q_B = J^* P_B J$.
This argument cannot be extended to cover the general case, but it does suggest a
proof strategy for Theorem 15.29, namely, to adapt the proof of the Sz.-Nagy dilation
theorem.


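Proof of Theorem 15.29 Let

\[ S := \mathscr{F} \times H = \{(F, x) : F \in \mathscr{F}, \ x \in H\} \]

and consider the function $Q : S \times S \to \mathbb{C}$ defined by

\[ Q(p, p') := (Q_{F \cap F'} x|x') \quad \text{for } p = (F, x), \ p' = (F', x'). \]

We claim that this function is positive definite in the sense that for all finite choices of
$p_1, \dots, p_N \in S$ and $z_1, \dots, z_N \in \mathbb{C}$ we have

\[ \sum_{n,m=1}^N Q(p_n, p_m) z_n \overline{z_m} \ge 0. \tag{15.8} \]

First assume that $p_n = (F_n, x_n)$ with the sets $F_n$ disjoint. In that case,

\[ \sum_{n,m=1}^N Q(p_n, p_m) z_n \overline{z_m} = \sum_{n,m=1}^N (Q_{F_n \cap F_m} x_n|x_m) z_n \overline{z_m} = \sum_{n=1}^N (Q_{F_n} z_n x_n|z_n x_n) \ge 0 \]

by the positivity of the operators $Q_{F_n}$. For general $F_1, \dots, F_N \in \mathscr{F}$ we write their union
$\bigcup_{n=1}^N F_n$ as a union of $2^N$ disjoint sets $C_\sigma$ in $\mathscr{F}$, indexed by the elements $\sigma \in 2^N$, the
power set of $\{1, \dots, N\}$, as follows. For $\sigma \in 2^N$ we set

\[ C_\sigma := \bigcap_{n \in \sigma} F_n \cap \bigcap_{m \notin \sigma} \complement F_m. \]

It is straightforward to check that the sets $C_\sigma$ are pairwise disjoint and that for all
$n, m = 1, \dots, N$ we have

\[ F_n = \bigcup_{\substack{\sigma \in 2^N \\ n \in \sigma}} C_\sigma, \qquad F_n \cap F_m = \bigcup_{\substack{\sigma \in 2^N \\ \{n,m\} \subseteq \sigma}} C_\sigma. \]

Then, by the additivity of $Q$ and the positivity of the operators $Q_{C_\sigma}$,

\begin{align*}
\sum_{n,m=1}^N Q(p_n, p_m) z_n \overline{z_m} &= \sum_{n,m=1}^N (Q_{F_n \cap F_m} x_n|x_m) z_n \overline{z_m} \\
&= \sum_{n,m=1}^N \Bigl( \sum_{\substack{\sigma \in 2^N \\ \{n,m\} \subseteq \sigma}} Q_{C_\sigma} x_n \Big| x_m \Bigr) z_n \overline{z_m} \\
&= \sum_{\sigma \in 2^N} \sum_{\substack{1 \le n,m \le N \\ \{n,m\} \subseteq \sigma}} (Q_{C_\sigma} x_n|x_m) z_n \overline{z_m} \\
&= \sum_{\sigma \in 2^N} \Bigl( Q_{C_\sigma} \sum_{\substack{1 \le n \le N \\ n \in \sigma}} z_n x_n \Big| \sum_{\substack{1 \le m \le N \\ m \in \sigma}} z_m x_m \Bigr) \ge 0.
\end{align*}

This completes the proof of (15.8).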

Let $V$ be the vector space of finitely supported complex-valued functions defined on
$S$. The elements of $V$ are functions $v : S \to \mathbb{C}$ such that $v(p) = 0$ for all but at most
finitely many pairs $p = (F, x) \in S$. The function $v \in V$ that maps $p \in S$ to the complex
number $z$ and is identically zero otherwise will be denoted as $v = z 1_p$. For two functions
$v, v' \in V$, say $v = \sum_{n=1}^N z_n 1_{p_n}$ and $v' = \sum_{n=1}^N z_n' 1_{p_n}$ (allowing some of the $z_n$ and $z_n'$ to be
zero) we define

\[ (v|v') := \sum_{n,m=1}^N Q(p_n, p_m) z_n \overline{z_m'}. \tag{15.9} \]

Arguing as in the proof of Theorem 8.34, this uniquely defines a sesquilinear mapping
from $V \times V$ to $\mathbb{C}$ which satisfies $(v|v') = \overline{(v'|v)}$ for all $v, v' \in V$ and $(v|v) \ge 0$ for all
$v \in V$, and

\[ N = \{v \in V : (v|v) = 0\} \]

is a subspace of $V$. It follows that (15.9) induces an inner product on the vector space
quotient $V/N$. Let $\widetilde{H}$ denote the Hilbert space completion of $V/N$ with respect to this
inner product.

Consider elements in $\widetilde{H}$ of the form $p + N = (\Omega, x) + N$ and $p' + N = (\Omega, x') + N$ with
$x, x' \in H$. Then

\[ (p + N|p' + N)_{\widetilde{H}} = (Q_\Omega x|x') = (x|x'). \]

Taking $x' = x$, in particular we may identify $x \in H$ isometrically with the element $p + N$
in $\widetilde{H}$, where $p = (\Omega, x)$. In this way we obtain an isometric embedding $J$ of $H$ into $\widetilde{H}$.
To ease up notation we use the notation $p = (F, x)$ for general elements of $\widetilde{H}$, rather
than the more precise notation $p + N = (F, x) + N$. With this notation, $Jx = (\Omega, x)$.
The mapping $\pi : \widetilde{H} \to \widetilde{H}$ defined by

\[ \pi(F, x) := (\Omega, Q_F x) \]

satisfies $\pi^2(F, x) = \pi(\Omega, Q_F x) = (\Omega, Q_\Omega Q_F x) = (\Omega, Q_F x) = \pi(F, x)$. We extend $\pi$ by
linearity and check that this results in a selfadjoint, hence orthogonal, projection in $\widetilde{H}$
whose range equals $H$. From

\[ (\pi(F, x)|x')_H = (Q_F x|x')_H = ((F, x)|(\Omega, x'))_{\widetilde{H}} = ((F, x)|Jx')_{\widetilde{H}} \]

it follows that $\pi$, viewed as a mapping from $\widetilde{H}$ to $H$, equals $J^*$.
Finally set

\[ P_F(F', x) := (F \cap F', x). \]

Again it is routine to check that $P_F$ is an orthogonal projection in $\widetilde{H}$. By the properties (i)
and (ii) in Definition 15.23 the mapping $P : \mathscr{F} \to \mathscr{P}(\widetilde{H})$ is a projection-valued measure.
Finally, from

\[ J^* P_F J x = \pi P_F(\Omega, x) = \pi(\Omega \cap F, x) = \pi(F, x) = Q_F x \]

we conclude that $Q_F = J^* P_F J$.


The argument given after the statement of Theorem 5.32 works for general contrac- 
tions: 


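Theorem 15.30. For every contraction $T \in \mathscr{L}(H)$, there exists a unique POVM $Q :
\mathscr{B}(\mathbb{T}) \to \mathscr{E}(H)$ such that

\[ T^n = \int_{\mathbb{T}} \lambda^n \, dQ(\lambda), \quad n \in \mathbb{N}. \]

If $Q$ is a POVM with the above property, then $T$ is unitary if and only if $Q$ is a projection-valued
measure.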


Proof Existence follows along the lines just mentioned: if $U$ is a unitary
dilation of $T$ and $P$ is its projection-valued measure, the compression $Q$ of $P$ has the
required properties.

To prove uniqueness, suppose that for all $x \in H$ we have

\[ (T^n x|x) = \int_{\mathbb{T}} z^n \, dQ_x(z) = \int_{\mathbb{T}} z^n \, d\widetilde{Q}_x(z), \quad n \in \mathbb{N}, \]

where $\widetilde{Q}$ is another POVM on $\mathbb{T}$. This means that the nonnegative Fourier coefficients
of the probability measures $Q_x$ and $\widetilde{Q}_x$ agree. Now Theorem 5.32 (and the observation
following it) can be applied to see that $Q_x = \widetilde{Q}_x$.

For the final statement it only remains to prove the ‘only if’ part. But this follows
from uniqueness, for if $T := \int_{\mathbb{T}} z \, dQ(z)$ is unitary for some POVM $Q$ on $\mathbb{T}$, then we
may also represent $T$ in terms of its associated projection-valued measure $P$, that is,
$T = \int_{\mathbb{T}} z \, dP(z)$. By uniqueness, $Q = P$.


If $Q$ is as in Theorem 15.30, then for all trigonometric polynomials $f \in C(\mathbb{T})$ of the
form $f(z) = \sum_{n=0}^N c_n z^n$ we have

\[ \Psi(f) = \int_{\mathbb{T}} f \, dQ = f(T), \]

where $T = \int_{\mathbb{T}} z \, dQ(z)$. By the continuity of the bounded functional calculus with respect
to the supremum norm, this identity persists for functions $f$ in the disc algebra $A(\mathbb{D})$, the
Banach space of all functions $f \in C(\mathbb{T})$ which have a continuous extension to $\overline{\mathbb{D}}$ which is
holomorphic on $\mathbb{D}$; these are precisely the functions belonging to the closure in $C(\mathbb{T})$
of the trigonometric polynomials of the form just considered. An easy consequence is
that the bounded functional calculus of a POVM $Q$ on $\mathbb{T}$ is multiplicative on the disc
algebra, that is,

\[ \Psi(f) \Psi(g) = \Psi(fg), \quad f, g \in A(\mathbb{D}). \tag{15.10} \]


15.3.d The Phase/Number Pair 


A convenient model for the number operator, the selfadjoint operator in Quantum Optics
which corresponds to the observable of counting the number of photons, can be
given on the Hardy space $H^2(\mathbb{D})$ considered in Section 7.3.d. Recall that $H^2(\mathbb{D})$ is the
Hilbert space of all holomorphic functions on $\mathbb{D}$ of the form $\sum_{n \in \mathbb{N}} c_n z^n$ with

\[ \sum_{n \in \mathbb{N}} |c_n|^2 < \infty. \]

As we have seen in that section, the mapping

\[ \sum_{n \in \mathbb{N}} c_n z^n \mapsto \sum_{n \in \mathbb{N}} c_n e_n \]

with $e_n(\theta) := e^{in\theta}$ sets up an isometry from $H^2(\mathbb{D})$ onto the closed subspace of $L^2(\mathbb{T})$
consisting of all functions whose negative Fourier coefficients vanish. This allows us to
identify $H^2(\mathbb{D})$ with the range of the Riesz projection $\sum_{n \in \mathbb{Z}} c_n e_n \mapsto \sum_{n \in \mathbb{N}} c_n e_n$ on $L^2(\mathbb{T})$.
In $H^2(\mathbb{D})$ we consider the unbounded selfadjoint operator $\hat{n}$ with domain

\[ D(\hat{n}) = \Bigl\{ f = \sum_{n \in \mathbb{N}} c_n z^n \in H^2(\mathbb{D}) : \sum_{n \in \mathbb{N}} n^2 |c_n|^2 < \infty \Bigr\}, \]

given by

\[ \hat{n} z^n := n z^n, \quad n \in \mathbb{N}, \]

where $z^n$ is shorthand for the function $z \mapsto z^n$. Here, and in later sections, we adopt the
Physics convention to denote concrete observables with a ‘hat’. This notation should
not be confused with the notation for Fourier transforms. Selfadjointness of $\hat{n}$ is evident
by noting that, up to unitary equivalence, it is a multiplication operator of the form
considered in Example 10.60.

The sequence $(z^n)_{n \in \mathbb{N}}$ is an orthonormal basis of eigenvectors for $\hat{n}$ and accordingly
we have $\mathbb{N} \subseteq \sigma(\hat{n})$. On the other hand, if $\lambda \in \mathbb{C} \setminus \mathbb{N}$, then for every $f = \sum_{n \in \mathbb{N}} c_n z^n$ in
$H^2(\mathbb{D})$ the equation $(\lambda - \hat{n}) u = f$ is uniquely solved by $u = \sum_{n \in \mathbb{N}} \frac{c_n}{\lambda - n} z^n \in D(\hat{n})$. This
implies that $\lambda \in \rho(\hat{n})$. We conclude that

\[ \sigma(\hat{n}) = \mathbb{N}. \tag{15.11} \]

(This is a special case of Proposition 10.32, but the proof could be simplified here because
we have precise information about the domain of the operator.) We think of the
eigenfunctions $z^n$ of $\hat{n}$ as the pure states describing the $n$-photon states of an electromagnetic
field. In this interpretation, (15.11) tells us that the number of photons observed is
a nonnegative integer.

The projection-valued measure $N$ associated with $\hat{n}$ is given by $N(\{n\}) = \pi_n$, the
orthogonal projection in $H^2(\mathbb{D})$ onto the one-dimensional subspace spanned by $z^n$, so
that

\[ (\hat{n} f|f) = \int_{\mathbb{N}} n \, dN_f(n) = \sum_{n \in \mathbb{N}} n (\pi_n f|f), \quad f \in D(\hat{n}). \]
neN 


To define phase as a $\mathbb{T}$-valued unsharp observable in the sense of POVMs we proceed
as follows. Let $S$ be the ‘left shift’ on $H^2(\mathbb{D})$, that is,

\[ S \sum_{n \in \mathbb{N}} c_n z^n := \sum_{n \in \mathbb{N}} c_{n+1} z^n. \]

In the language of Section 7.3.d, $S$ is the Toeplitz operator $T_\phi$ with symbol $\phi(z) = \overline{z}$.
Identifying $H^2(\mathbb{D})$ with the range of the Riesz projection in $L^2(\mathbb{T})$, a unitary dilation of
$S$ is given by the ‘left shift’ $V$ on $L^2(\mathbb{T})$,

\[ V \sum_{n \in \mathbb{Z}} c_n e_n := \sum_{n \in \mathbb{Z}} c_{n+1} e_n. \]

The projection-valued measure $P : \mathscr{B}(\mathbb{T}) \to \mathscr{P}(L^2(\mathbb{T}))$ associated with $V$ is easily
checked to be given by

\[ P_B f = 1_B f, \quad B \in \mathscr{B}(\mathbb{T}), \ f \in L^2(\mathbb{T}). \tag{15.12} \]

Its compression to $H^2(\mathbb{D})$ is a POVM $\Phi : \mathscr{B}(\mathbb{T}) \to \mathscr{E}(H^2(\mathbb{D}))$, which is called the phase
observable. It satisfies

\[ S^n = \int_{\mathbb{T}} \lambda^n \, d\Phi(\lambda), \quad n \in \mathbb{N}. \]

The covariance property expressed in the following theorem identifies the POVM $\Phi$ as
the “complementary unsharp observable” to the number observable $\hat{n}$. The notions of
covariance and complementarity will be developed in more detail in Section 15.5.


Theorem 15.31 (Covariance of phase). Let $U$ be the unitary $C_0$-group generated by $i\hat{n}$.
The phase observable $\Phi$ is covariant with respect to this group, that is, for all Borel
subsets $B \subseteq \mathbb{T}$ we have

\[ U(t) \Phi_B U^*(t) = \Phi_{e^{it}B}, \quad t \in \mathbb{R}, \]

where $e^{it}B = \{e^{it}z : z \in B\}$ is the rotation of $B$ over $t$.


Proof Since the POVM $\Phi$ is the compression of the projection-valued measure $P$ given
by (15.12), for all $m, n \in \mathbb{N}$ we have

\[ (\Phi_B U^*(t) e_n|e_m) = (P_B J U^*(t) e_n|J e_m) = e^{-int} (1_B e_n|e_m), \]

while at the same time

\begin{align*}
(U^*(t) \Phi_{e^{it}B} e_n|e_m) = (P_{e^{it}B} J e_n|J U(t) e_m) &= e^{-imt} (1_{e^{it}B} e_n|e_m) \\
&= e^{-imt} \int_{e^{it}B} z^{n-m} \, dz = e^{-imt} \int_B (e^{-it} z)^{n-m} \, dz \\
&= e^{-int} \int_B z^{n-m} \, dz = e^{-int} (1_B e_n|e_m).
\end{align*}

Since the functions $e_n$, $n \in \mathbb{N}$, have dense span in $H^2(\mathbb{D})$, this completes the proof.


15.4 Hidden Variables 


The one-to-one correspondence of Theorem 15.25 between the set of POVMs and the
set of affine mappings from $\mathscr{S}(H)$ to $M_1^+(\Omega)$ is particularly satisfying from a philosophical
point of view, as it characterises unsharp observables in an operational way: an
unsharp observable is nothing but a rule of assigning probability distributions to states
in such a way that convex combinations are respected. The rationale of this assumption
has been discussed in Remark 15.26.

Thinking of unsharp observables as affine mappings from $\mathscr{S}(H)$ to $M_1^+(\Omega)$, the convex
set of all probability measures on $\Omega$, analogously we can define classical unsharp
observables as affine mappings from $M_1^+(X)$ to $M_1^+(\Omega)$, where $(X, \mathscr{X})$ is the state space
of the classical system. Indeed, in Section 15.2 we have defined observables as measurable
functions from $X$ to $\Omega$, and such functions $f$ induce an affine mapping from $M_1^+(X)$
to $M_1^+(\Omega)$ by sending $\mu$ to its image measure $f(\mu) = \mu \circ f^{-1}$. In this way, every classical
observable defines a classical unsharp observable.

The following theorem shows that every family of quantum observables with values
in a locally compact Hausdorff space admits a classical model, in the sense made precise
in the formulation of the theorem.


Theorem 15.32 (Hidden variables). Let $\Omega$ be a locally compact Hausdorff space and
suppose that $Q_i : \mathscr{B}(\Omega) \to \mathscr{E}(H)$, $i \in I$, is a family of unsharp quantum mechanical
observables indexed by an index set $I$, each of which induces an affine map from $\mathscr{S}(H)$ to
$M_1^+(\Omega)$ by the prescription

\[ (Q_i(T))(B) = \mathrm{tr}((Q_i)_B T), \quad B \in \mathscr{B}(\Omega). \]

Then there exists a compact Hausdorff space $X$ and a family of unsharp classical observables
$f_i : M_1^+(X) \to M_1^+(\Omega)$, $i \in I$, each of which induces an affine map from $M_1^+(X)$ to
$M_1^+(\Omega)$ by the prescription

\[ (f_i(\mu))(B) = \mu(f_i^{-1}(B)), \quad B \in \mathscr{B}(\Omega), \]

such that the following conditions hold:

(1) the elements of $X$ are the equivalence classes of the extreme points of $\mathscr{S}(H)$ modulo
indiscernibility under $Q$; here two extreme points $T, T'$ of $\mathscr{S}(H)$ are said to be
indiscernible under $Q_i$, $i \in I$, if

\[ Q_i(T') = Q_i(T), \quad i \in I; \]

(2) the quotient mapping $q : T \mapsto [T]$ from the set of extreme points of $\mathscr{S}(H)$ to $X$ is
continuous;

(3) the classical unsharp observables $f_i$ are related to the unsharp observables $Q_i$ by
passing to equivalence classes, that is,

\[ f_i(\delta_{[T]}) = Q_i(T), \]

where $\delta_{[T]} \in M_1^+(X)$ is the Dirac measure supported on $[T] \in X$.


The classical observables $f_i$ are hidden variables: by their very definition, they
cannot be observed using the measurements of the quantum observables $Q_i$.


Sketch of the proof In what follows we think of the set $\mathrm{Extr}(\mathscr{S}(H))$ of extreme points
of $\mathscr{S}(H)$ as being equipped with the coarsest topology $\tau$ such that all mappings $T \mapsto
\langle g, Q_i(T) \rangle$ with $i \in I$ and $g \in C_0(\Omega)$ are continuous; the brackets refer to the
duality between $C_0(\Omega)$ and $M(\Omega)$ (see Theorem 4.2). Adapting the proof of the Banach–Alaoglu
theorem in a straightforward way, it is seen that $\tau$ turns $\mathrm{Extr}(\mathscr{S}(H))$ into a compact
Hausdorff space.

As mentioned in the statement of the theorem, we define

\[ X := \mathrm{Extr}(\mathscr{S}(H)) / \sim, \]

where $\sim$ is the equivalence relation of indiscernibility under $Q_i$, $i \in I$. We endow this
space with the quotient topology $\nu$, that is, we declare a subset of $X$ to belong to $\nu$ if its
pre-image under the quotient mapping $q : T \mapsto [T]$ belongs to $\tau$. This topology renders
the quotient mapping from $\mathrm{Extr}(\mathscr{S}(H))$ to $X$ continuous. As a result, the space $X$ is a
compact Hausdorff space with respect to $\nu$.

We extend the mapping $\delta_{[T]} \mapsto Q_i(T)$ to the convex hull in $M_1^+(X)$ of the Dirac
measures by convexity:

\[ f_i \Bigl( \sum_{n=1}^N \lambda_n \delta_{[T_n]} \Bigr) := \sum_{n=1}^N \lambda_n Q_i(T_n) \]

for scalars $0 \le \lambda_n \le 1$ such that $\sum_{n=1}^N \lambda_n = 1$. Clearly, $f_i$ preserves convex combinations.
The functions $f_i$ are continuous with respect to the weak* topologies of $M_1^+(X)$ and
$M_1^+(\Omega)$. The span of the Dirac measures is dense in $M_1^+(X)$ with respect to the weak*
topology. Leaving aside some topological subtleties, these functions admit unique extensions
to all of $M_1^+(X)$, and these extensions again preserve convex combinations.

It is of some interest to work through the details of this construction for the qubit.
Accordingly let $H = \mathbb{C}^2$. As we have seen in Section 15.2.e, the set of extreme points
of $\mathscr{S}(H)$ then corresponds to the Bloch sphere $S^2$ in $\mathbb{R}^3$. Under this correspondence
the Bloch vector $\xi = (\sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta) \in S^2$ corresponds to the operator
$T_\xi \in \mathscr{S}(\mathbb{C}^2)$ given in matrix form as

\[ T_\xi = \frac12 \begin{pmatrix} 1 + \cos\theta & e^{-i\varphi} \sin\theta \\ e^{i\varphi} \sin\theta & 1 - \cos\theta \end{pmatrix}. \]

Let us take the Pauli matrices as the set of observables of interest:

\[ \{\sigma_1, \sigma_2, \sigma_3\}. \]

Let $P_i$ be the projection-valued measure associated with $\sigma_i$. For example, $(P_1)_{\{1\}}$ and
$(P_1)_{\{-1\}}$ are the orthogonal projections onto the one-dimensional subspaces spanned by
the eigenvectors corresponding to the eigenvalues $1$ and $-1$ of $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$:

\[ (P_1)_{\{1\}} = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}, \qquad (P_1)_{\{-1\}} = \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix}. \]

We view these as observables with values in $\Omega := \{\pm 1\}$. Since these observables separate the points
of $S^2$, the space $X$ constructed in the proof of Theorem 15.32 can be identified with $S^2$.


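The corresponding family of classical variables $f = \{f_1, f_2, f_3\}$ is given by the mappings
$f_j : M_1^+(S^2) \to M_1^+(\{\pm 1\})$,

\begin{align*}
(f_1(\delta_\xi))(\{1\}) &= \mathrm{tr}((P_1)_{\{1\}} T_\xi) \\
&= \mathrm{tr}\Bigl( \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix} \cdot \frac12 \begin{pmatrix} 1 + \cos\theta & e^{-i\varphi} \sin\theta \\ e^{i\varphi} \sin\theta & 1 - \cos\theta \end{pmatrix} \Bigr) \\
&= \frac14 \, \mathrm{tr} \begin{pmatrix} 1 + \cos\theta + e^{i\varphi} \sin\theta & 1 - \cos\theta + e^{-i\varphi} \sin\theta \\ 1 + \cos\theta + e^{i\varphi} \sin\theta & 1 - \cos\theta + e^{-i\varphi} \sin\theta \end{pmatrix} \\
&= \frac14 \bigl( 1 + \cos\theta + e^{i\varphi} \sin\theta + 1 - \cos\theta + e^{-i\varphi} \sin\theta \bigr) \\
&= \frac12 + \frac12 \cos\varphi \sin\theta = \frac12 + \frac12 \xi_1,
\end{align*}

and likewise

\[ (f_1(\delta_\xi))(\{-1\}) = \frac12 - \frac12 \xi_1. \]

Similar computations give

\[ (f_2(\delta_\xi))(\{\pm 1\}) = \frac12 \pm \frac12 \xi_2, \qquad (f_3(\delta_\xi))(\{\pm 1\}) = \frac12 \pm \frac12 \xi_3. \]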
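
These three computations are easy to check numerically. The Python sketch below is an illustration only; it assumes the standard Pauli matrices, so that the spectral projection of $\sigma_j$ for the eigenvalue $+1$ is $\frac12 (I + \sigma_j)$, and the helper names `T_xi`, `P_PLUS` and `probability_plus` are ours.

```python
import cmath
import math


def T_xi(theta, phi):
    """Matrix of the pure state with Bloch vector (sin t cos p, sin t sin p, cos t)."""
    s, c = math.sin(theta), math.cos(theta)
    return [[(1 + c) / 2, cmath.exp(-1j * phi) * s / 2],
            [cmath.exp(1j * phi) * s / 2, (1 - c) / 2]]


# Spectral projections (I + sigma_j)/2 for the eigenvalue +1 of the three Pauli matrices.
P_PLUS = {
    1: [[0.5, 0.5], [0.5, 0.5]],
    2: [[0.5, -0.5j], [0.5j, 0.5]],
    3: [[1.0, 0.0], [0.0, 0.0]],
}


def probability_plus(j, theta, phi):
    """tr((P_j)_{+1} T_xi): the probability of the outcome +1 for sigma_j."""
    P, T = P_PLUS[j], T_xi(theta, phi)
    return sum(P[a][b] * T[b][a] for a in range(2) for b in range(2)).real


theta, phi = 0.7, 1.9
xi = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))
for j in (1, 2, 3):
    # The two printed numbers agree up to rounding.
    print(probability_plus(j, theta, phi), 0.5 + 0.5 * xi[j - 1])
```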


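By considering convex combinations of Dirac measures and a limiting argument, it
follows that the hidden variables $f_j : M_1^+(S^2) \to M_1^+(\{\pm 1\})$ are given by

\[ (f_j(\mu))(\{\pm 1\}) = \frac12 \Bigl( 1 \pm \int_{S^2} x_j \, d\mu(x) \Bigr), \quad j = 1, 2, 3, \]

that is, $f_j(\mu)$ has density $\varepsilon \mapsto 1 + \varepsilon \int_{S^2} x_j \, d\mu(x)$ with respect to $\rho$,
where $\rho$ is the probability measure on $\{\pm 1\}$ giving each point mass $\frac12$.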


15.5 Symmetries 


In order to motivate our definition of a symmetry we introduce some notation and terminology.
The adjoint of a conjugate-linear mapping $T : H \to H$ (that is, a mapping
satisfying $T(x + y) = T(x) + T(y)$ and $T(cx) = \overline{c}\,T(x)$) is the unique conjugate-linear mapping
$T^* : H \to H$ defined by

\[ (x|T^* y) = \overline{(Tx|y)}, \quad x, y \in H. \]

A mapping $T : H \to H$ is called antiunitary if it is conjugate-linear and satisfies $TT^* =
T^*T = I$.

It is straightforward to check that if $U : H \to H$ is unitary or antiunitary, then the mapping

\[ \mathscr{U}(T) := UTU^* \]

is well defined as a mapping from $\mathscr{S}(H)$ to $\mathscr{S}(H)$ and satisfies

\[ \mathrm{tr}(\mathscr{U}(T_1) \mathscr{U}(T_2)) = \mathrm{tr}(T_1 T_2), \quad T_1, T_2 \in \mathscr{S}(H). \tag{15.13} \]

[Figure: portrait of Eugene Wigner, 1902-1995]

Here, as before, $\mathscr{S}(H)$ denotes the set of all positive trace class operators on $H$ with unit trace.
As we have seen, the extreme points of this set are precisely the rank one projections $h \,\bar\otimes\, h$ with $h \in H$
of norm one. The physical interpretation of (15.13) is that $\mathscr{U}$ preserves transition probabilities
between pure states. To see this, recall that if $A$ is the selfadjoint operator in $H$
representing a real-valued observable and $h \in H$ has norm one (and belongs to $D(A)$ in
case $A$ is unbounded), then the quantity

\[ \mathrm{tr}(A \circ (h \,\bar\otimes\, h)) = (Ah|h) \]

is interpreted as the expected value of the observable in the pure state $|h\rangle$. In particular,
if $|g\rangle$ is another pure state, then

\[ \mathrm{tr}((g \,\bar\otimes\, g) \circ (h \,\bar\otimes\, h)) = ((g \,\bar\otimes\, g) h|h) = |(g|h)|^2 \]

is the expected value of the observable $g \,\bar\otimes\, g$ in state $|h\rangle$. In Physics parlance, this is the
probability of “finding a system with state $|h\rangle$ in state $|g\rangle$ when measuring it against an
orthonormal basis containing $g$”, that is, the transition probability between $|h\rangle$ and $|g\rangle$.
A remarkable theorem due to Wigner provides a converse to (15.13): 


Theorem 15.33 (Wigner). If $\mathscr U : \mathscr S(H) \to \mathscr S(H)$ is a bijection with the property that

$$ \operatorname{tr}(\mathscr U(T_1)\,\mathscr U(T_2)) = \operatorname{tr}(T_1T_2), \qquad T_1,T_2 \in \mathscr S(H), $$

then there exists a mapping $U : H \to H$ which is either unitary or antiunitary such that

$$ \mathscr U(T) = UTU^*, \qquad T \in \mathscr S(H). $$

This mapping is unique up to a complex scalar of modulus one.


We sketch a proof of the theorem only for the case of the qubit, that is, for $H = \mathbb C^2$,
and refer to the Notes for some missing details and references to the general case.


Sketch of the proof of Theorem 15.33 for the qubit We begin by recalling (15.4) and (15.7), which state that if we write a unit vector $h \in \mathbb C^2$ as $\cos(\theta/2)|0\rangle + e^{i\varphi}\sin(\theta/2)|1\rangle$ with $0 \le \theta \le \pi$ and $0 \le \varphi < 2\pi$, then the rank one projection $h\,\bar\otimes\,h$ in $\mathbb C^2$ onto the span of $h$ is given as a matrix by

$$ h\,\bar\otimes\,h = \frac12\begin{pmatrix} 1+\cos\theta & e^{-i\varphi}\sin\theta \\ e^{i\varphi}\sin\theta & 1-\cos\theta \end{pmatrix}. $$

By elementary computation,

$$ \operatorname{tr}((h\,\bar\otimes\,h)(h'\,\bar\otimes\,h')) = |(h|h')|^2 = \frac12(1 + x_h\cdot x_{h'}), $$

where, as in (15.5),

$$ x_h = (\sin\theta\cos\varphi,\ \sin\theta\sin\varphi,\ \cos\theta) $$

is the Bloch vector of $h$. Under the bijective correspondence $h\,\bar\otimes\,h \leftrightarrow x_h$ between the elements of $\mathscr S(\mathbb C^2)$ and the points of the unit sphere $S^2$ of $\mathbb R^3$, the assumption of the theorem implies that $\mathscr U$ induces a mapping $\mathscr R : S^2 \to S^2$ satisfying $\frac12(1 + \mathscr R x_h\cdot \mathscr R x_{h'}) = \frac12(1 + x_h\cdot x_{h'})$, that is,

$$ \mathscr R x_h\cdot \mathscr R x_{h'} = x_h\cdot x_{h'}. $$

This identity implies that the $3\times 3$ matrix $R$ defined by

$$ R_{ij} = \mathscr R u_j\cdot u_i, \qquad i,j \in \{1,2,3\}, $$

with $u_1,u_2,u_3$ the standard unit vectors of $\mathbb R^3$, is orthogonal. Now we use the algebraic fact, taken for granted here, that for every orthogonal $3\times 3$ matrix $R$ with real coefficients there exists a mapping $U : \mathbb C^2 \to \mathbb C^2$ which is either unitary or antiunitary, and which is unique up to a complex scalar of modulus one, such that

$$ U(x\cdot\sigma)U^* = (Rx)\cdot\sigma, \qquad x \in \mathbb R^3, $$

where $x\cdot\sigma := x_1\sigma_1 + x_2\sigma_2 + x_3\sigma_3$ for $x \in \mathbb R^3$, and $\sigma_1,\sigma_2,\sigma_3$ are the Pauli matrices.
The mapping $U$ has the required properties.
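
The "elementary computation" in the sketch can be checked numerically. The following is a minimal sketch in plain Python, not part of the original argument; the helper names (`unit_vector`, `projection`, `bloch_vector`, `transition_probability_check`) are illustrative choices. It compares $\operatorname{tr}((h\,\bar\otimes\,h)(h'\,\bar\otimes\,h'))$, $|(h|h')|^2$ and $\frac12(1+x_h\cdot x_{h'})$ for one pair of angles.

```python
import cmath
import math


def unit_vector(theta, phi):
    """h = cos(theta/2)|0> + e^{i phi} sin(theta/2)|1> as a pair of complex numbers."""
    return (complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2))


def projection(h):
    """Matrix of the rank one projection onto span(h): entries h_i * conj(h_j)."""
    return [[h[i] * h[j].conjugate() for j in range(2)] for i in range(2)]


def bloch_vector(theta, phi):
    """Bloch vector (sin(theta)cos(phi), sin(theta)sin(phi), cos(theta))."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def trace_of_product(a, b):
    """tr(ab) for 2 x 2 matrices given as nested lists."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def transition_probability_check(theta1, phi1, theta2, phi2):
    """Return the three quantities that the sketch of the proof claims are equal."""
    h1, h2 = unit_vector(theta1, phi1), unit_vector(theta2, phi2)
    via_trace = trace_of_product(projection(h1), projection(h2)).real
    inner = sum(h1[i] * h2[i].conjugate() for i in range(2))
    via_inner_product = abs(inner) ** 2
    x1, x2 = bloch_vector(theta1, phi1), bloch_vector(theta2, phi2)
    via_bloch = 0.5 * (1 + sum(p * q for p, q in zip(x1, x2)))
    return via_trace, via_inner_product, via_bloch
```

Calling `transition_probability_check` with any admissible angles should return three numbers that agree up to rounding error.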


Informally speaking, Wigner’s theorem tells us that symmetries $\mathscr U$ of quantum mechanical systems are given by operators $U$ acting on the underlying Hilbert space that are either unitary or antiunitary. In practice one is primarily interested in one-parameter groups of symmetries indexed by time. Suppose $(\mathscr U(t))_{t\in\mathbb R}$ is such a group. By the uniqueness part of Wigner’s theorem, for all $s,t \in \mathbb R$ the identity $\mathscr U(t)\mathscr U(s) = \mathscr U(t+s)$ implies the existence of a scalar $c(t,s)$ of modulus one such that the corresponding (anti)unitary operators satisfy

$$ U(t)U(s) = c(t,s)U(t+s). $$

From the associative law $U(t)(U(s)U(r)) = (U(t)U(s))U(r)$ we obtain the cocycle identity

$$ c(s,r)\,c(t,s+r) = c(t,s)\,c(t+s,r). $$

In this situation, a theorem of Bargmann implies that there exists a function $d$, taking values in the scalars of modulus one, such that

$$ c(t,s) = \frac{d(t)d(s)}{d(t+s)} $$

and the operators $V(t) := d(t)^{-1}U(t)$ are unitary. They satisfy

$$ (\mathscr U(t))(T) = V(t)TV(t)^*, \qquad V(t)V(s) = V(t+s). $$

The unitary group $(V(t))_{t\in\mathbb R}$ can be shown to be strongly continuous. Hence, by Stone’s theorem, it follows that there exists a selfadjoint operator $\mathscr H$, the Hamiltonian associated with the family $\mathscr U$, such that $V(t) = e^{it\mathscr H}$ for $t \in \mathbb R$. The action of the unitary $C_0$-group $(e^{it\mathscr H})_{t\in\mathbb R}$ on pure states is given by $\mathscr U(t)(h\,\bar\otimes\,h) = V(t)h\,\bar\otimes\,V(t)h$. The equation

$$ \frac{d}{dt}V(t)h = i\mathscr H V(t)h $$

is an abstract version of the Schrödinger equation (13.34) (which corresponds to the special case $H = L^2(\mathbb R^d,m)$ and $\mathscr H = -\Delta\ +$ potential). These considerations motivate the following definition.


Definition 15.34 (Symmetry, of a Hilbert space). A symmetry of $H$ is a unitary operator on $H$.


15.5.a Covariance


Definition 15.35 (Symmetry, of a measure space). A symmetry of the measure space $(\Omega,\mathscr F,\mu)$ is a measurable bijective mapping $g : \Omega \to \Omega$ with measurable inverse that leaves $\mu$ invariant, that is,

$$ (g(\mu))(F) = \mu(g^{-1}(F)) = \mu(F), \qquad F \in \mathscr F. $$

If $g$ is a symmetry of $(\Omega,\mathscr F,\mu)$, then for all $F \in \mathscr F$ the set $g(F)$ is measurable and $\mu(F) = \mu(g^{-1}(g(F))) = \mu(g(F))$, that is, $\mu$ is also invariant under $g^{-1}$.


Definition 15.36 (Conservation and covariance). Let $(\Omega,\mathscr F,\mu)$ be a measure space and let $U$ be a symmetry of a Hilbert space $H$.

(i) An observable $P : \mathscr F \to \mathscr P(H)$ is said to be conserved under $U$ if $UP_FU^* = P_F$ for all $F \in \mathscr F$.

(ii) An observable $P : \mathscr F \to \mathscr P(H)$ is said to be covariant under the pair $(g,U)$, where $g$ is a symmetry of $(\Omega,\mathscr F,\mu)$, if $UP_FU^* = P_{g(F)}$ for all $F \in \mathscr F$, that is, if the following diagram commutes:

$$
\begin{array}{ccc}
\mathscr F & \xrightarrow{\ F\,\mapsto\, g(F)\ } & \mathscr F \\
\downarrow{\scriptstyle P} & & \downarrow{\scriptstyle P} \\
\mathscr P(H) & \xrightarrow{\ P_F\,\mapsto\, UP_FU^*\ } & \mathscr P(H)
\end{array}
$$

As we will see shortly, position is covariant with respect to translation (an object at position $x$ appears at position $x - x'$ if the origin is translated over $x'$) and momentum is covariant with respect to boosts (an object with momentum $p$ appears with momentum $p - p'$ if a boost of size $p'$ is applied, that is, if the origin ‘in momentum space’ is translated over $p'$).


Locally Compact Abelian Groups An interesting special case arises when observables take values in a locally compact abelian (LCA) group $G$. Our treatment borrows some results from the theory of LCA groups that will not be proved here. For the special cases $\mathbb R^d$ and $\mathbb T$ the presentation is self-contained, as all missing details can be filled in with the help of the results of Chapter 5. A fuller treatment of symmetries should also cover the case of (noncommutative) Lie groups such as $SO(3)$ and $SU(2)$, but this would take us too far afield.

Every LCA group $G$ admits a Haar measure, that is, a Borel measure $\mu$ such that $\mu(B) = \mu(g^{-1}(B))$ for all $B \in \mathscr B(G)$ and $g \in G$. This measure is unique up to a scalar multiple. With respect to a Haar measure $\mu$, every $g \in G$ induces a symmetry on $G$ by

$$ g : g' \mapsto gg', \qquad g' \in G. $$

This induces a symmetry $U_g$ on $L^2(G) := L^2(G,\mu)$ given by

$$ U_g f := f\circ g^{-1}. $$

We refer to $U_g$ as the translation over $g$. We have

$$ U(g_1)U(g_2)f = (f\circ g_2^{-1})\circ g_1^{-1} = f\circ(g_1g_2)^{-1} = U(g_1g_2)f, $$

so $U(g_1)U(g_2) = U(g_1g_2)$. This means that the mapping $U : G \to \mathscr L(L^2(G))$, $g \mapsto U_g$, is multiplicative and hence defines a unitary representation. It is easily checked that this representation is strongly continuous.

A character of $G$ is a continuous group homomorphism $\gamma : G \to \mathbb T$. The set $\Gamma$ of all characters of $G$ is called the Pontryagin dual of $G$. It has the structure of a locally compact abelian group in a natural way by endowing it with the weak* topology inherited from $L^\infty(G)$ (local compactness being a consequence of the fact that $\Gamma\cup\{0\}$ is a weak* closed subset of the closed unit ball $\overline B_{L^\infty(G)}$, which is weak* compact by the Banach–Alaoglu theorem). It follows that $\Gamma$ carries a Haar measure which is again unique up to a normalisation. Every $g \in G$ defines a character $\gamma \mapsto \gamma(g)$ on $\Gamma$, and the Pontryagin duality theorem asserts that these are the only ones and the Pontryagin dual of $\Gamma$ equals $G$ both as a set and as an LCA group.

Every character $\gamma \in \Gamma$ induces a symmetry on $L^2(G)$ via

$$ V_\gamma f(g') = \gamma(g')f(g'), \qquad g' \in G,\ f \in L^2(G). $$

We refer to $V_\gamma$ as the boost over $\gamma$. In the language of Chapter 5, $V_\gamma$ is the pointwise multiplier with $\gamma$.

It is immediate from the above definitions that the so-called Weyl commutation relation holds:


Proposition 15.37 (Weyl commutation relation). For all $g \in G$ and $\gamma \in \Gamma$ we have

$$ V_\gamma U_g = \gamma(g)\,U_gV_\gamma. $$

Proof For $f \in L^2(G)$ and $g' \in G$ we have

$$ V_\gamma U_g f(g') = \gamma(g')\,U_g f(g') = \gamma(g')f(g^{-1}g') $$

and

$$ \gamma(g)U_gV_\gamma f(g') = \gamma(g)\,V_\gamma f(g^{-1}g') = \gamma(g)\gamma(g^{-1}g')f(g^{-1}g') = \gamma(g')f(g^{-1}g'). $$
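
As an illustration, the Weyl commutation relation can be tested on the finite cyclic group $\mathbb Z_N$, which is a (compact, discrete) LCA group whose characters are $\gamma_k(n) = e^{2\pi i kn/N}$. The sketch below is plain Python; the group order `N` and the helper names are assumptions made for this example and do not appear in the text.

```python
import cmath

N = 12  # order of the cyclic group Z_N (illustrative choice)


def character(k):
    """The character gamma_k(n) = exp(2 pi i k n / N) of Z_N."""
    return lambda n: cmath.exp(2j * cmath.pi * k * n / N)


def translate(g, f):
    """U_g f = f o g^{-1}, that is, (U_g f)(n) = f(n - g)."""
    return [f[(n - g) % N] for n in range(N)]


def boost(k, f):
    """V_gamma f = gamma * f, pointwise multiplication by the character gamma_k."""
    gamma = character(k)
    return [gamma(n) * f[n] for n in range(N)]


def weyl_defect(g, k, f):
    """Largest entry of |V_gamma U_g f - gamma(g) U_g V_gamma f|."""
    lhs = boost(k, translate(g, f))
    scalar = character(k)(g)
    rhs = [scalar * value for value in translate(g, boost(k, f))]
    return max(abs(a - b) for a, b in zip(lhs, rhs))
```

For any function `f` on $\mathbb Z_N$ (a list of `N` complex numbers), `weyl_defect(g, k, f)` should vanish up to rounding error for all `g` and `k`.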
We now turn to the definition of a pair of canonical observables that can be associated with LCA groups. For the statement of the second part of the theorem we need the Plancherel theorem for LCA groups, which asserts that the Fourier–Plancherel transform $\mathscr F : L^1(G) \to L^\infty(\Gamma)$ defined by

$$ \mathscr F f(\gamma) = \int_G f(g)\,\overline{\gamma(g)}\,d\mu(g), \qquad \gamma \in \Gamma, $$

where $\mu$ is a Haar measure on $G$, maps $L^1(G)\cap L^2(G)$ into $L^2(\Gamma)$ and there is a unique normalisation of the Haar measure of $\Gamma$ such that $\mathscr F$ extends to a unitary operator from $L^2(G)$ onto $L^2(\Gamma)$. For later reference we note that $\overline{\gamma(g)} = \gamma(g^{-1})$ and hence

$$ \mathscr F(U_g f)(\gamma) = \int_G f(g^{-1}g')\,\overline{\gamma(g')}\,d\mu(g') = \int_G f(g')\,\overline{\gamma(gg')}\,d\mu(g') = \overline{\gamma(g)}\,\mathscr F f(\gamma), \tag{15.14} $$

where the second identity follows by substitution and invariance of $\mu$.


Theorem 15.38 (Position and momentum). Let $G$ be an LCA group, let $\Gamma$ be its Pontryagin dual, and let $\mathscr B(G)$ and $\mathscr B(\Gamma)$ be their Borel $\sigma$-algebras. Then:

(1) there exists a unique $G$-valued observable $X : \mathscr B(G) \to \mathscr P(L^2(G))$ such that for all $\gamma \in \Gamma$ we have

$$ V_\gamma = \int_G \gamma\,dX, $$

and it is given by $X_Bf = 1_Bf$ for $f \in L^2(G)$ and $B \in \mathscr B(G)$;

(2) there exists a unique $\Gamma$-valued observable $\Xi : \mathscr B(\Gamma) \to \mathscr P(L^2(G))$ such that for all $g \in G$ we have

$$ U_g = \int_\Gamma \overline{\gamma(g)}\,d\Xi(\gamma), $$

and it is given by $\Xi_Bf = \mathscr F^{-1}1_B\mathscr F f$ for $f \in L^2(G)$ and $B \in \mathscr B(\Gamma)$.

Proof (1): Consider the $G$-valued observable $X : \mathscr B(G) \to \mathscr P(L^2(G))$ defined by

$$ X_Bf := 1_Bf, \qquad B \in \mathscr B(G),\ f \in L^2(G). $$

For Borel sets $B \subseteq G$ the operator $T_B := \int_G 1_B\,dX$ on $L^2(G)$ is the pointwise multiplier $T_Bf = 1_Bf$. By linearity, for $\mu$-simple functions $\phi$ on $G$ the operator $T_\phi := \int_G \phi\,dX$ on $L^2(G)$ is the pointwise multiplier $T_\phi f = \phi f$. By approximation, the operator $T_\gamma := \int_G \gamma\,dX$ is the pointwise multiplier $T_\gamma f = \gamma f = V_\gamma f$. This proves existence.

We only sketch the proof of uniqueness; for the special cases $G = \mathbb R^d$ and $G = \mathbb T$ the missing details are easily filled in by using the properties of the Fourier transform proved in Chapter 5. If $\widetilde X$ is an observable satisfying $V_\gamma = \int_G \gamma\,d\widetilde X$, then for all $f \in L^2(G)$ we have $\int_G \gamma\,d\widetilde X_f = \int_G \gamma\,dX_f$. This can be interpreted as saying that the finite Borel measures $\widetilde X_f$ and $X_f$ have the same Fourier transforms. Their equality therefore follows from the injectivity of the Fourier transform as a mapping from $M(G)$ to $L^\infty(\Gamma)$.

(2): We begin with the existence part. Applying the construction of the preceding part to $\Gamma$ we obtain the dual position operator $X' : \mathscr B(\Gamma) \to \mathscr L(L^2(\Gamma))$ given by

$$ X'_B\phi := 1_B\phi, \qquad B \in \mathscr B(\Gamma),\ \phi \in L^2(\Gamma). $$

By conjugation with the Fourier–Plancherel transform it induces an observable $\Xi : \mathscr B(\Gamma) \to \mathscr L(L^2(G))$:

$$ \Xi_Bf := \mathscr F^{-1}X'_B\mathscr F f, \qquad B \in \mathscr B(\Gamma),\ f \in L^2(G). $$

This observable has the desired property. Uniqueness is proved in the same way as in part (1).


Definition 15.39 (Position and momentum). The $G$-valued observable $X$ and the $\Gamma$-valued observable $\Xi$ of the theorem are called the position and momentum observables of $G$, respectively.


Proposition 15.40. Let $G$ be an LCA group and let $X$ and $\Xi$ be the position and momentum observables of $G$.

(1) $X$ is covariant with respect to every $(g,U_g)$ and conserved under every $V_\gamma$:

$$ U_gX_BU_g^*f = X_{gB}f, \qquad V_\gamma X_BV_\gamma^*f = 1_Bf = X_Bf. $$

(2) $\Xi$ is covariant with respect to every $(\gamma,V_\gamma)$ and conserved under every $U_g$:

$$ V_\gamma\Xi_BV_\gamma^*f = \Xi_{\gamma B}f, \qquad U_g\Xi_BU_g^*f = \mathscr F^{-1}1_B\mathscr F f = \Xi_Bf. $$

Proof (1): For all $g \in G$ and $B \in \mathscr B(G)$ we have

$$ U_gX_BU_g^*f(g') = U_g[1_BU_{g^{-1}}f](g') = 1_B(g^{-1}g')f(g') = X_{gB}f(g'), $$

proving covariance with respect to $(g,U_g)$. Conservation under $V_\gamma$ is even simpler:

$$ V_\gamma X_BV_\gamma^*f = \gamma(\cdot)\,1_B\,\overline{\gamma(\cdot)}\,f(\cdot) = 1_Bf = X_Bf. $$

The proof of (2) is entirely similar.
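
The four identities of Proposition 15.40 can be tried out on $G = \mathbb Z_N$, whose dual is again $\mathbb Z_N$ and whose Fourier–Plancherel transform is the unitary discrete Fourier transform. The following plain Python sketch makes these identifications as assumptions; the function names (`position`, `momentum`, `covariance_defects`) are illustrative only. Since the group is written additively, $gB$ becomes $g + B$.

```python
import cmath

N = 8  # order of the cyclic group Z_N (illustrative choice)


def translate(g, f):
    """(U_g f)(n) = f(n - g); the adjoint of U_g is U_{-g}."""
    return [f[(n - g) % N] for n in range(N)]


def boost(k, f):
    """(V_k f)(n) = exp(2 pi i k n / N) f(n); the adjoint of V_k is V_{-k}."""
    return [cmath.exp(2j * cmath.pi * k * n / N) * f[n] for n in range(N)]


def fourier(f):
    """Unitary DFT: F f(k) = N^{-1/2} sum_n f(n) conj(gamma_k(n))."""
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N ** 0.5
            for k in range(N)]


def inverse_fourier(F):
    return [sum(F[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N ** 0.5
            for n in range(N)]


def position(B, f):
    """X_B f = 1_B f."""
    return [f[n] if n in B else 0 for n in range(N)]


def momentum(B, f):
    """Xi_B f = F^{-1} 1_B F f."""
    return inverse_fourier(position(B, fourier(f)))


def shift(B, g):
    return {(b + g) % N for b in B}


def distance(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


def covariance_defects(g, k, B, f):
    """Defects in the four identities of Proposition 15.40 (all should be ~0)."""
    return (
        distance(translate(g, position(B, translate(-g, f))), position(shift(B, g), f)),
        distance(boost(k, position(B, boost(-k, f))), position(B, f)),
        distance(boost(k, momentum(B, boost(-k, f))), momentum(shift(B, k), f)),
        distance(translate(g, momentum(B, translate(-g, f))), momentum(B, f)),
    )
```

All four entries returned by `covariance_defects` should be zero up to rounding error, for any subset `B` of `range(N)` and any `f`.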


Remark 15.41 (Complementarity). In the Physics literature, the ‘duality’ between the position and momentum observables is referred to as complementarity. As we will see in the next two sections, this captures the complementarity of the position and momentum observables in $\mathbb R^d$ as well as that of the angle and angular momentum observables in $\mathbb T$.


15.5.b The Case $G = \mathbb R^d$: Position and Momentum


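We now specialise to $G = \mathbb R^d$ with normalised Lebesgue measure $dm(x) = (2\pi)^{-d/2}\,dx$ as the Haar measure. Every character $\gamma : \mathbb R^d \to \mathbb T$ is of the form

$$ \gamma(x) = e_\xi(x) = e^{ix\cdot\xi} $$

for some $\xi \in \mathbb R^d$. Under the identification $\gamma \leftrightarrow e_\xi$ we have $\Gamma = \mathbb R^d$, the Haar measure being the normalised Lebesgue measure $m$. To distinguish $G = \mathbb R^d$ from its dual $\Gamma = \mathbb R^d$ we use roman letters for elements of $G$ and greek letters for elements of its dual $\Gamma$.

For every $y \in \mathbb R^d$, the translation $y : x \mapsto x+y$ is a symmetry of $\mathbb R^d$. The induced symmetry $U_y$ on $L^2(\mathbb R^d,m)$ is right translation over $y$:

$$ U_yf(x) = f(x-y), \qquad x \in \mathbb R^d,\ f \in L^2(\mathbb R^d,m). $$

For every $\xi \in \mathbb R^d$ the boost $V_\xi$ on $L^2(\mathbb R^d,m)$ is given by

$$ V_\xi f(x) = e^{ix\cdot\xi}f(x), \qquad x \in \mathbb R^d,\ f \in L^2(\mathbb R^d,m). $$

The Weyl commutation relation takes the form

$$ V_\xi U_x = e^{ix\cdot\xi}\,U_xV_\xi, \qquad x,\xi \in \mathbb R^d. $$

Applying this identity to functions in $C_c^1(\mathbb R^d)$ and differentiating this relation with respect to $x_j$ and $\xi_k$, we obtain the Heisenberg commutation relation

$$ \hat x_j\hat\xi_k - \hat\xi_k\hat x_j = i\delta_{jk}I, \tag{15.15} $$

where

$$ \hat x_jf(x) := x_jf(x), \qquad \hat\xi_kf(x) := \frac1i\frac{\partial f}{\partial x_k}(x). \tag{15.16} $$

The rigorous interpretation of (15.15) is that for all $f \in C_c^1(\mathbb R^d)$ the equality $\hat x_j\hat\xi_kf - \hat\xi_k\hat x_jf = i\delta_{jk}f$ holds in $L^2(\mathbb R^d,m)$. Of course (15.15) could also be derived directly from the definitions given in (15.16).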
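
For completeness, the direct derivation from the definitions can be written out as follows. This is a worked computation added for illustration, using only the product rule and interpreting the derivative of $x_jf$ in the distributional sense for $f \in C_c^1(\mathbb R^d)$:

```latex
\begin{aligned}
\hat x_j \hat\xi_k f - \hat\xi_k \hat x_j f
  &= \frac{1}{i}\, x_j \frac{\partial f}{\partial x_k}
     - \frac{1}{i}\, \frac{\partial}{\partial x_k}\bigl(x_j f\bigr) \\
  &= \frac{1}{i}\, x_j \frac{\partial f}{\partial x_k}
     - \frac{1}{i}\Bigl(\delta_{jk} f + x_j \frac{\partial f}{\partial x_k}\Bigr) \\
  &= -\frac{1}{i}\,\delta_{jk} f = i\,\delta_{jk} f .
\end{aligned}
```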

When considered as densely defined operators in $L^2(\mathbb R^d,m)$ with domain $C_c^1(\mathbb R^d)$, the operators $\hat x_j$ and $\hat\xi_k$ are closable and their closures are selfadjoint on $L^2(\mathbb R^d,m)$. Indeed, by Stone’s theorem (Theorem 13.46), the generators of the unitary $C_0$-groups $(V_{te_j})_{t\in\mathbb R}$ and $(U_{-te_k})_{t\in\mathbb R}$ are of the form $iA_j$ and $iB_k$ with $A_j$, $B_k$ selfadjoint. Moreover, $C_c^1(\mathbb R^d)$ is contained in their domains and we have $A_jf = \hat x_jf$ and $B_kf = \hat\xi_kf$ for all $f \in C_c^1(\mathbb R^d)$. The space $C_c^1(\mathbb R^d)$ is dense in $L^2(\mathbb R^d,m)$ and invariant under both groups, and therefore by Proposition 13.5 it is dense in both $D(A_j)$ and $D(B_k)$ with respect to their respective graph norms.

With slight abuse of notation we write $\hat x_j$ and $\hat\xi_k$ for the selfadjoint closures and denote their domains by $D(\hat x_j)$ and $D(\hat\xi_k)$. For pure states $\phi$ represented by a norm one function $f \in D(\hat x_j)\cap D(\hat\xi_j)$ such that $\hat x_jf, \hat\xi_jf \in D(\hat x_j)\cap D(\hat\xi_j)$, the uncertainty principle of Theorem 15.17 takes the form

$$ \Delta_\phi(\hat x_j)\,\Delta_\phi(\hat\xi_j) \ge \frac12. $$


It follows from Proposition 15.40 that the position observable $X$ is covariant with respect to translations and conserved under boosts, and that the momentum observable $\Xi$ is conserved under translations and covariant with respect to boosts. This essentially characterises these observables:


Theorem 15.42 (Covariance characterisation of position and momentum). Up to conjugation with a translation, respectively a boost, position and momentum are characterised by their covariance and conservation properties. More precisely, denoting by $X$ and $\Xi$ the position and momentum observables of Definition 15.39, the following assertions hold.

(1) if the observable $P : \mathscr B(\mathbb R^d) \to \mathscr P(L^2(\mathbb R^d,m))$ is covariant with respect to all pairs $(x,U_x)$, $x \in \mathbb R^d$, and conserved under all boosts $V_\xi$, $\xi \in \mathbb R^d$, then there exists a unique $y \in \mathbb R^d$ such that

$$ P_B = U_yX_BU_y^*, \qquad B \in \mathscr B(\mathbb R^d); $$

(2) if the observable $P : \mathscr B(\mathbb R^d) \to \mathscr P(L^2(\mathbb R^d,m))$ is invariant under all translations $U_x$, $x \in \mathbb R^d$, and covariant with respect to all pairs $(\xi,V_\xi)$, $\xi \in \mathbb R^d$, then there exists a unique $\eta \in \mathbb R^d$ such that

$$ P_B = V_\eta\Xi_BV_\eta^*, \qquad B \in \mathscr B(\mathbb R^d). $$


Proof Let the projection-valued measure $P : \mathscr B(\mathbb R^d) \to \mathscr P(L^2(\mathbb R^d,m))$ be covariant with respect to all pairs $(x,U_x)$ and conserved under all boosts $V_\xi$. The boost invariance means that every projection $P_B$ commutes with pointwise multiplication with every trigonometric exponential $x \mapsto \exp(ix\cdot\xi)$. By Lemma 5.34, this implies that $P_B$ is a pointwise multiplier of the form $P_Bf(x) = m_B(x)f(x)$ with $m_B \in L^\infty(\mathbb R^d,m)$. Since $P_B$ is a projection, $m_B$ must be an indicator function, say of the set $C_B$:

$$ P_Bf = 1_{C_B}f. $$

Substituting this into the covariance with respect to translations, we arrive at the identity $1_{C_B}(x-t) = 1_{C_{B+t}}(x)$ as elements of $L^\infty(\mathbb R^d,m)$, that is, we have

$$ C_B + t = C_{B+t} $$

up to a null set. Similarly one sees that $C_{\mathbb R^d} = \mathbb R^d$ and $C_{\complement B} = \complement C_B$ up to null sets. Finally, if $B$ and $B'$ are disjoint, then so are $P_B$ and $P_{B'}$, and therefore $C_B$ and $C_{B'}$ are disjoint up to a null set. It follows that the mapping $B \mapsto C_B$ commutes up to null sets with translations and the Boolean set operations.

Let $B := [0,1)^d$ be the half-open unit cube. Suppose $x,y \in \mathbb R^d$ are Lebesgue points of $1_{C_B}$ satisfying $\max_{1\le j\le d}|x_j - y_j| \ge 1$. Then $C_B$ and $C_B + y - x$ intersect in a set of positive measure. This is only possible if $B$ and $B + y - x$ intersect in a set of positive measure, but these sets are disjoint. This contradiction proves that all Lebesgue points $x,y$ of $1_{C_B}$ satisfy $\max_{1\le j\le d}|x_j - y_j| < 1$. Since almost every point of $C_B$ is a Lebesgue point, it follows that up to a null set, $C_B$ is contained in the rectangle $\prod_{j=1}^d [a_j,b_j]$, where

$$ a_j := \inf\{x_j : x \text{ is a Lebesgue point of } 1_{C_B}\}, \qquad b_j := \sup\{x_j : x \text{ is a Lebesgue point of } 1_{C_B}\}. $$

In particular, since $0 \le b_j - a_j \le 1$, up to a null set we have $C_B \subseteq \prod_{j=1}^d [a_j,a_j+1) = a + B$, where $a = (a_1,\dots,a_d)$.

We next claim that up to a null set we have equality

$$ C_B = a + B. $$

Indeed, the sets $k + B$ with $k \in \mathbb Z^d$ are pairwise disjoint and their union is $\mathbb R^d$. Hence, up to null sets, the sets $C_{k+B} = k + C_B$ are disjoint and their union is $\mathbb R^d$. This is only possible if $(a+B)\setminus C_B$ is a null set. This proves the claim.

Let $n \in \mathbb N$ and consider the set $B^{(n)} := [0,2^{-n})^d$. The same argument as above proves that there exists an $a^{(n)} \in \mathbb R^d$ such that $C_{B^{(n)}} = a^{(n)} + B^{(n)}$. Now $B$ is the disjoint union of $2^{nd}$ translates $k + B^{(n)}$, $k \in \{0,2^{-n},\dots,1-2^{-n}\}^d$. Therefore, up to a null set, $C_B = a + B$ is the disjoint union of the $2^{nd}$ sets $C_{k+B^{(n)}} = a^{(n)} + k + B^{(n)}$. This union equals $a^{(n)} + B$. This shows that $a^{(n)} = a$ for all $n \in \mathbb N$.

Summarising what we have proved, we find that for all sets $B$ of the form $y + B^{(n)}$ with $y \in \mathbb R^d$ and $n \in \mathbb N$, we have

$$ P_Bf = 1_{a+B}f. $$

Equivalently, this can be expressed as

$$ P_B = U_aX_BU_{-a} = U_aX_BU_a^*. $$

This proves the first part of the theorem. To prove the second part, suppose that the projection-valued measure $P : \mathscr B(\mathbb R^d) \to \mathscr P(L^2(\mathbb R^d,m))$ is conserved under all $U_x$ and covariant with respect to all pairs $(\xi,V_\xi)$. Then $\widetilde P_B := \mathscr F^{-1}P_B\mathscr F$ defines a projection-valued measure that is covariant with respect to all pairs $(y,U_y)$ and conserved under all $V_\eta$. It follows from the previous step that $\widetilde P = U_aXU_a^*$ for some $a \in \mathbb R^d$, and therefore $P = \mathscr F\widetilde P\mathscr F^{-1} = V_a\Xi V_a^*$ by (15.14).


15.5.c The Case $G = \mathbb T$: Angle and Angular Momentum


The results of the preceding section have natural analogues for the unit circle $\mathbb T$. We identify $\mathbb T$ with the unit circle of $\mathbb C$ and take the normalised Lebesgue measure on $\mathbb T$ as Haar measure. Every character $\gamma : \mathbb T \to \mathbb T$ is of the form

$$ \gamma(z) = z^k, \qquad z \in \mathbb T, $$

for some $k \in \mathbb Z$. Under this identification we have $\Gamma = \mathbb Z$, its normalised Haar measure being the counting measure.

For every $w \in \mathbb T$ the rotation $z \mapsto wz$ is a symmetry of $\mathbb T$. The induced symmetry $U_w$ on $L^2(\mathbb T)$ is given by

$$ U_wf(z) = f(w^{-1}z), \qquad z \in \mathbb T,\ f \in L^2(\mathbb T). \tag{15.17} $$

For every $k \in \mathbb Z$ the boost $V_k$ on $L^2(\mathbb T)$ is given by

$$ V_kf(z) = z^kf(z), \qquad z \in \mathbb T,\ f \in L^2(\mathbb T). \tag{15.18} $$

The Weyl commutation relation takes the form

$$ V_kU_z = z^k\,U_zV_k, \qquad z \in \mathbb T,\ k \in \mathbb Z. $$

The position and momentum observables in $\mathbb T$ associated with the symmetries $U_z$ and $V_k$ are denoted by $\Theta$ and $L$ and are called the angle and (orbital) angular momentum observables. They take values in $\mathbb T$ and $\mathbb Z$ respectively; in particular, angular momentum can only assume discrete values. In the Physics literature one speaks about ‘quanta’ of angular momentum.

Remark 15.43. By viewing $\mathbb Z$ as a subset of the real line, we may identify $L$ with a real-valued observable and thus associate with $L$ a selfadjoint operator $\hat l$ on $L^2(\mathbb T)$. There is no natural way, however, to do the same with $\Theta$. One could identify $\mathbb T$ with the interval $(-\pi,\pi]$ contained in the real line and thus identify $\Theta$ with a real-valued observable. The choice of the interval $(-\pi,\pi]$ is somewhat arbitrary, however, and entails a nonuniqueness issue that cannot be resolved satisfactorily. The associated selfadjoint operator $\hat\theta$ appears not to be very useful. For instance, it does not satisfy the ‘continuous variable’ Weyl commutation relation

$$ e^{is\hat\theta}e^{it\hat l} = e^{-ist}\,e^{it\hat l}e^{is\hat\theta}. $$

This will be further discussed in Problem 15.13.


By Proposition 15.40, $\Theta$ is covariant under every pair $(z,U_z)$ and conserved under every $V_k$, and $L$ is conserved under every $U_z$ and covariant under every pair $(k,V_k)$. Repeating the proof of Theorem 15.42 almost verbatim we arrive at the following result.


Theorem 15.44 (Covariance characterisation of angle and angular momentum). Up to conjugation with a translation, respectively a boost, angle and angular momentum are characterised by their covariance and conservation properties. More precisely, denoting by $\Theta$ and $L$ the position and momentum operators associated with the rotations $U_z$ and boosts $V_k$ given by (15.17) and (15.18), the following assertions hold.

(1) if the observable $P : \mathscr B(\mathbb T) \to \mathscr P(L^2(\mathbb T))$ is covariant with respect to all pairs $(z,U_z)$, $z \in \mathbb T$, and conserved under all $V_k$, $k \in \mathbb Z$, then there exists a unique $w \in \mathbb T$ such that

$$ P_B = U_w\Theta_BU_w^*, \qquad B \in \mathscr B(\mathbb T); $$

(2) if the observable $P : \mathscr B(\mathbb Z) \to \mathscr P(L^2(\mathbb T))$ is conserved under all $U_z$, $z \in \mathbb T$, and covariant with respect to all pairs $(k,V_k)$, $k \in \mathbb Z$, then there exists a unique $j \in \mathbb Z$ such that

$$ P_B = V_jL_BV_j^*, \qquad B \in \mathscr B(\mathbb Z). $$


15.5.d The Stone–Von Neumann Theorem


We have seen in Theorem 15.42 that the $\mathbb R^d$-valued position and momentum observables are uniquely determined, up to conjugation with a translation and a boost respectively, by their transformation properties under translations and boosts. It is interesting to observe that both the covariance relation for position

$$ U_xX_BU_x^*f = X_{x+B}f, \qquad x \in \mathbb R^d,\ f \in L^2(\mathbb R^d,m), $$

and the covariance relation for momentum

$$ V_\xi\Xi_BV_\xi^*f = \Xi_{\xi+B}f, \qquad \xi \in \mathbb R^d,\ f \in L^2(\mathbb R^d,m), $$

imply the Weyl commutation relation. Here, as before, $dm(x) = (2\pi)^{-d/2}\,dx$ is the normalised Lebesgue measure. Indeed, approximating $e_\xi$ by simple functions, as in the proof of Theorem 15.38 we find that the covariance relation for position implies the identity $V_\xi U_x = e^{ix\cdot\xi}U_xV_\xi$ for $x,\xi \in \mathbb R^d$, which is the Weyl commutation relation. In the same way the covariance relation for momentum implies the Weyl commutation relation. In view of this it is reasonable to ask to what extent position and momentum are determined by the Weyl commutation relation. The answer to this question is provided by a theorem due to Stone and von Neumann (Theorem 15.48 and its corollary). Proving this theorem is the main objective of the present section.

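We start with some preparation. Suppose that $U,V : \mathbb R^d \to \mathscr L(H)$ are strongly continuous unitary representations of $\mathbb R^d$ on a Hilbert space $H$ such that the Weyl commutation relation holds, that is,

$$ V_\xi U_x = e^{ix\cdot\xi}\,U_xV_\xi, \qquad x,\xi \in \mathbb R^d. \tag{15.19} $$

The relation (15.19) states that $U$ and $V$ ‘commute up to a multiplicative scalar of modulus one’. This suggests interpreting (15.19) as a ‘projective’ unitary representation of $\mathbb R^d\times\mathbb R^d$ on $H$. There is a quick way to extend (15.19) to a unitary representation as follows. Consider the unitary operators

$$ W(x,\xi) = e^{\frac12 ix\cdot\xi}\,U_xV_\xi = e^{-\frac12 ix\cdot\xi}\,V_\xi U_x, \qquad x,\xi \in \mathbb R^d. \tag{15.20} $$

The operators $W(x,\xi)$ defined by (15.20) satisfy

$$
\begin{aligned}
W(x,\xi)W(x',\xi') &= e^{\frac12 i(x\cdot\xi + x'\cdot\xi')}\,U_xV_\xi U_{x'}V_{\xi'} \\
&= e^{\frac12 i(x\cdot\xi + x'\cdot\xi')}e^{ix'\cdot\xi}\,U_xU_{x'}V_\xi V_{\xi'} \\
&= e^{\frac12 i(x'\cdot\xi - x\cdot\xi')}e^{\frac12 i(x+x')\cdot(\xi+\xi')}\,U_{x+x'}V_{\xi+\xi'} \\
&= e^{\frac12 i(x'\cdot\xi - x\cdot\xi')}\,W(x+x',\xi+\xi').
\end{aligned}
\tag{15.21}
$$

From this it follows that the unitary operators defined by

$$ W(x,\xi,t) = e^{it}\,W(x,\xi) \tag{15.22} $$

satisfy

$$
\begin{aligned}
W(x,\xi,t)W(x',\xi',t') &= e^{i(t+t')}\,W(x,\xi)W(x',\xi') \\
&= e^{i(t+t'+\frac12(x'\cdot\xi - x\cdot\xi'))}\,W(x+x',\xi+\xi') \\
&= W((x,\xi,t)\circ(x',\xi',t')),
\end{aligned}
\tag{15.23}
$$

where

$$ (x,\xi,t)\circ(x',\xi',t') = \Big(x+x',\ \xi+\xi',\ t+t'+\frac12(x'\cdot\xi - x\cdot\xi')\Big). $$

One easily checks that the operation $\circ$ turns $\mathbb H^d := \mathbb R^d\times\mathbb R^d\times\mathbb R$ into a group:


Definition 15.45 (Heisenberg group). The Heisenberg group in dimension $d$ is the group $\mathbb H^d := \mathbb R^d\times\mathbb R^d\times\mathbb R$ with composition law

$$ (x,\xi,t)\circ(x',\xi',t') = \Big(x+x',\ \xi+\xi',\ t+t'+\frac12(x'\cdot\xi - x\cdot\xi')\Big). $$


The identity (15.23) informs us that $W$ defines a unitary representation of the Heisenberg group $\mathbb H^d$ on $H$. It is strongly continuous and it satisfies

$$ W(0,0,t) = e^{it}I, \qquad t \in \mathbb R. \tag{15.24} $$


Definition 15.46 (Schrödinger representation). The Schrödinger representation is the unitary representation $W : \mathbb H^d \to \mathscr L(L^2(\mathbb R^d,m))$ arising in the special case where the unitary representations $U,V : \mathbb R^d \to \mathscr L(L^2(\mathbb R^d,m))$ are given by translations and boosts, respectively.


Explicitly, the Schrödinger representation is given by

$$ W(x,\xi,t)f(x') = e^{it}e^{-\frac12 ix\cdot\xi}e^{ix'\cdot\xi}f(x'-x). $$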
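
The group law and the explicit formula for the Schrödinger representation can be cross-checked with a short computation. The following plain Python sketch treats the case $d = 1$; the function names (`compose`, `inverse`, `schroedinger`) are illustrative and not taken from the text. It uses exact rational arithmetic for the group law and floating point arithmetic for the representation.

```python
import cmath
from fractions import Fraction


def compose(a, b):
    """(x, xi, t) o (x', xi', t') = (x + x', xi + xi', t + t' + (x' xi - x xi')/2), d = 1."""
    x, xi, t = a
    xp, xip, tp = b
    return (x + xp, xi + xip, t + tp + (xp * xi - x * xip) / 2)


def inverse(a):
    """Inverse element of (x, xi, t) in the Heisenberg group."""
    x, xi, t = a
    return (-x, -xi, -t)


def schroedinger(a, f):
    """W(x, xi, t) f (y) = e^{it} e^{-i x xi / 2} e^{i y xi} f(y - x)."""
    x, xi, t = a
    return lambda y: cmath.exp(1j * (t - x * xi / 2 + y * xi)) * f(y - x)


# Associativity of the composition law, checked exactly on rational points.
a = (Fraction(1, 2), Fraction(-3), Fraction(2, 7))
b = (Fraction(5, 3), Fraction(1, 4), Fraction(-1))
c = (Fraction(-2), Fraction(7, 5), Fraction(3, 8))
associative = compose(compose(a, b), c) == compose(a, compose(b, c))
```

Composing `schroedinger(a, schroedinger(b, f))` and comparing it pointwise with `schroedinger(compose(a, b), f)` gives a numerical check of (15.23) for the Schrödinger representation.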
Proposition 15.47. The Schrödinger representation is irreducible, that is, the only closed subspaces of $L^2(\mathbb R^d,m)$ invariant under the action of $W$ are the trivial subspaces $\{0\}$ and $L^2(\mathbb R^d,m)$.


The proof of this proposition will be given at the end of the section, for it uses elements of the proof of the following theorem which says that the Schrödinger representation is essentially the only irreducible unitary representation of $\mathbb H^d$ satisfying (15.24):


Theorem 15.48 (Stone–von Neumann). Let $\widetilde W : \mathbb H^d \to \mathscr L(H)$ be a strongly continuous unitary representation of $\mathbb H^d$ on a separable Hilbert space $H$. If $\widetilde W$ is irreducible and satisfies $\widetilde W(0,0,t) = e^{it}I$ for all $t \in \mathbb R$, then $\widetilde W$ is unitarily equivalent to the Schrödinger representation $W$. More precisely, there exists a unitary operator $S : L^2(\mathbb R^d,m) \to H$ such that

$$ \widetilde W(x,\xi,t) = S\,W(x,\xi,t)\,S^*, \qquad (x,\xi,t) \in \mathbb H^d. $$

The operator $S$ is unique up to a multiplicative scalar of modulus one.


Here, we use the term unitary operator for an operator $S : H \to K$, where $H$ and $K$ are Hilbert spaces, such that $S^*S = I$ (the identity operator on $H$) and $SS^* = I$ (the identity operator on $K$).

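We have the following immediate corollary for representations arising from pairs of unitary representations satisfying the Weyl commutation relation.


Corollary 15.49. Let $\widetilde U,\widetilde V : \mathbb R^d \to \mathscr L(H)$ be strongly continuous unitary representations on a separable Hilbert space $H$ satisfying the Weyl commutation relation

$$ \widetilde V_\xi\widetilde U_x = e^{ix\cdot\xi}\,\widetilde U_x\widetilde V_\xi, \qquad x,\xi \in \mathbb R^d. $$

Suppose furthermore that the family $\{\widetilde U_x, \widetilde V_\xi : x \in \mathbb R^d,\ \xi \in \mathbb R^d\}$ acts irreducibly on $H$ in the sense that the only closed subspaces invariant under all operators $\widetilde U_x$ and $\widetilde V_\xi$, $x \in \mathbb R^d$, $\xi \in \mathbb R^d$, are the trivial subspaces $\{0\}$ and $H$. Then there exists a unitary operator $S : L^2(\mathbb R^d,m) \to H$ such that

$$ \widetilde U_x = SU_xS^*, \qquad x \in \mathbb R^d, $$

$$ \widetilde V_\xi = SV_\xi S^*, \qquad \xi \in \mathbb R^d, $$

where $U$ and $V$ are the translation and boost representations on $L^2(\mathbb R^d,m)$, respectively. The operator $S$ is unique up to a multiplicative constant of modulus one.


Proof Defining $\widetilde W : \mathbb H^d \to \mathscr L(H)$ by (15.20) and (15.22), the irreducibility assumption of the corollary translates into the irreducibility of the representation $\widetilde W$. Since $\widetilde W$ satisfies (15.24), the result follows from Theorem 15.48 by noting that $\widetilde U_x = \widetilde W(x,0,0)$ and $\widetilde V_\xi = \widetilde W(0,\xi,0)$.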


We now fix a strongly continuous unitary representation $\widetilde W : \mathbb H^d \to \mathscr L(H)$ and define

$$ \widetilde U_x := \widetilde W(x,0,0), \qquad \widetilde V_\xi := \widetilde W(0,\xi,0), \qquad \widetilde W(x,\xi) := \widetilde W(x,\xi,0). $$

Then (15.19)–(15.23) hold again. We write $m$ for both the normalised Lebesgue measures on $\mathbb R^d$ and $\mathbb R^{2d}$.


Definition 15.50 (Weyl transform). For $a \in L^1(\mathbb R^{2d},m)$ we define the operator $\widetilde W(a) \in \mathscr L(H)$ by

$$ \widetilde W(a)h = \int_{\mathbb R^{2d}} a(x,\xi)\,\widetilde W(x,\xi)h\,dm(x)\,dm(\xi), \qquad h \in H, $$

where the integral is a Bochner integral in $H$.


The next two lemmas state some properties for the Weyl transform associated with the Schrödinger representation $W : \mathbb H^d \to \mathscr L(L^2(\mathbb R^d,m))$.


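Lemma 15.51. For all $a \in L^1(\mathbb R^{2d},m)\cap L^2(\mathbb R^{2d},m)$ the operator $W(a)$ is Hilbert–Schmidt on $L^2(\mathbb R^d,m)$ and

$$ \|W(a)\|_{\mathscr L_2(L^2(\mathbb R^d,m))} = \|a\|_{L^2(\mathbb R^{2d},m)}. $$

Proof For the Schrödinger representation we have the explicit formula

$$ W(x,\xi)f(x') = e^{-\frac12 ix\cdot\xi}e^{ix'\cdot\xi}f(x'-x), \qquad f \in L^2(\mathbb R^d,m), \tag{15.25} $$

where $W(x,\xi) := W(x,\xi,0)$ as in (15.22). By a change of variables and Fubini’s theorem we obtain

$$ W(a)f(x) = \int_{\mathbb R^d} k(x,x')f(x')\,dm(x'), $$

where

$$ k(x,x') = \int_{\mathbb R^d} e^{\frac12 i(x+x')\cdot\xi}\,a(x-x',\xi)\,dm(\xi). $$

By Plancherel’s theorem,

$$ \int_{\mathbb R^{2d}} \Big|\int_{\mathbb R^d} a(z,\xi)e^{\frac12 iy\cdot\xi}\,dm(\xi)\Big|^2\,dm(y)\,dm(z) = 2^d\int_{\mathbb R^{2d}} |a(z,y)|^2\,dm(y)\,dm(z) = 2^d\|a\|_2^2, $$

and hence, substituting $y = x+x'$ and $z = x-x'$,

$$ \int_{\mathbb R^{2d}} |k(x,x')|^2\,dm(x)\,dm(x') = \frac{1}{2^d}\int_{\mathbb R^{2d}} \Big|\int_{\mathbb R^d} a(z,\xi)e^{\frac12 iy\cdot\xi}\,dm(\xi)\Big|^2\,dm(y)\,dm(z) = \|a\|_2^2. $$

The result now follows from Example 14.3, which says that an integral operator with square integrable kernel is Hilbert–Schmidt, with Hilbert–Schmidt norm equal to the $L^2$-norm of the kernel.

Since $L^1(\mathbb R^{2d},m)\cap L^2(\mathbb R^{2d},m)$ is dense in $L^2(\mathbb R^{2d},m)$, the lemma implies that the mapping $W : a \mapsto W(a)$ has a unique extension to an isometry from $L^2(\mathbb R^{2d},m)$ into the space of Hilbert–Schmidt operators $\mathscr L_2(L^2(\mathbb R^d,m))$. This extension is again denoted by $W$.

A special role is played by the functions

$$ a_0(x,\xi) = \exp\big(-\tfrac14(|x|^2+|\xi|^2)\big), \qquad x,\xi \in \mathbb R^d, $$

$$ \phi_0(x) = 2^{d/4}\exp\big(-\tfrac12|x|^2\big), \qquad x \in \mathbb R^d. $$

Note that $\|\phi_0\|_{L^2(\mathbb R^d,m)} = 1$.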

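Lemma 15.52. The operator $W(a_0)$ equals the rank one projection $\phi_0\,\bar\otimes\,\phi_0$.


Proof Using (15.25) and the elementary identity (which follows from Lemma 5.20)

$$ \int_{\mathbb R^d} e^{-\frac12 ix\cdot\xi}e^{iu\cdot\xi}\exp\big(-\tfrac14|\xi|^2\big)\,dm(\xi) = 2^{d/2}\exp\big(-|u-\tfrac12 x|^2\big) $$

we obtain, for $f \in L^2(\mathbb R^d,m)$,

$$
\begin{aligned}
W(a_0)f &= \int_{\mathbb R^{2d}} \exp\big(-\tfrac14|x|^2\big)\exp\big(-\tfrac14|\xi|^2\big)e^{-\frac12 ix\cdot\xi}e^{i(\cdot)\cdot\xi}f(\cdot-x)\,dm(x)\,dm(\xi) \\
&= 2^{d/2}\int_{\mathbb R^d} \exp\big(-|(\cdot)-\tfrac12 x|^2\big)\exp\big(-\tfrac14|x|^2\big)f(\cdot-x)\,dm(x) \\
&= 2^{d/2}\exp\big(-\tfrac12|\cdot|^2\big)\int_{\mathbb R^d} \exp\big(-\tfrac12|x|^2\big)f(x)\,dm(x) = (\phi_0\,\bar\otimes\,\phi_0)f.
\end{aligned}
$$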


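Returning to a general strongly continuous unitary representation $\widetilde W : \mathbb H^d \to \mathscr L(H)$, we note the following important multiplicativity property.

Lemma 15.53. For all $a,b \in L^1(\mathbb R^{2d},m)$ we have

$$ \widetilde W(a)\widetilde W(b) = \widetilde W(a\#b), $$

where the so-called twisted convolution $a\#b \in L^1(\mathbb R^{2d},m)$ is defined by

$$ a\#b(x,\xi) := \int_{\mathbb R^{2d}} e^{\frac12 i(x'\cdot\xi - x\cdot\xi')}\,a(x-x',\xi-\xi')\,b(x',\xi')\,dm(x')\,dm(\xi'). $$

Young’s inequality implies that $a\#b$ does indeed belong to $L^1(\mathbb R^{2d},m)$.


Proof Fix $h \in H$. By (15.21), a change of variables, and Fubini’s theorem,

$$
\begin{aligned}
\widetilde W(a)\widetilde W(b)h &= \int_{\mathbb R^{4d}} a(x,\xi)b(x',\xi')\,\widetilde W(x,\xi)\widetilde W(x',\xi')h\,dm(x)\,dm(\xi)\,dm(x')\,dm(\xi') \\
&= \int_{\mathbb R^{4d}} e^{\frac12 i(x'\cdot\xi - x\cdot\xi')}a(x,\xi)b(x',\xi')\,\widetilde W(x+x',\xi+\xi')h\,dm(x)\,dm(\xi)\,dm(x')\,dm(\xi') \\
&= \int_{\mathbb R^{4d}} e^{\frac12 i(x'\cdot\xi - x\cdot\xi')}a(x-x',\xi-\xi')b(x',\xi')\,\widetilde W(x,\xi)h\,dm(x)\,dm(\xi)\,dm(x')\,dm(\xi') \\
&= \int_{\mathbb R^{2d}} a\#b(x,\xi)\,\widetilde W(x,\xi)h\,dm(x)\,dm(\xi) = \widetilde W(a\#b)h.
\end{aligned}
$$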


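This lemma is used to establish the following technical fact.

Lemma 15.54. We have

$$ \widetilde W(a_0)\widetilde W(x,\xi)\widetilde W(a_0) = a_0(x,\xi)\,\widetilde W(a_0), \qquad x,\xi \in \mathbb R^d. $$


Proof Repeating the steps in the proof of Lemma 15.53, for all $h \in H$ we obtain

$$
\begin{aligned}
\widetilde W(x,\xi)\widetilde W(a_0)h &= \int_{\mathbb R^{2d}} a_0(x',\xi')\,\widetilde W(x,\xi)\widetilde W(x',\xi')h\,dm(x')\,dm(\xi') \\
&= \int_{\mathbb R^{2d}} e^{\frac12 i(x'\cdot\xi - x\cdot\xi')}a_0(x'-x,\xi'-\xi)\,\widetilde W(x',\xi')h\,dm(x')\,dm(\xi') \\
&= \widetilde W(a_{x,\xi})h,
\end{aligned}
\tag{15.26}
$$

where $a_{x,\xi}(x',\xi') := e^{\frac12 i(x'\cdot\xi - x\cdot\xi')}a_0(x'-x,\xi'-\xi)$. Hence, by Lemma 15.53, the lemma is equivalent to the statement that

$$ \widetilde W(a_0\#a_{x,\xi}) = a_0(x,\xi)\,\widetilde W(a_0). $$

For this it suffices to show that

$$ a_0\#a_{x,\xi} = a_0(x,\xi)\,a_0. $$

By the injectivity of the Schrödinger representation $W$, which follows from Lemma 15.51, this in turn is equivalent to showing that

$$ W(a_0)W(x,\xi)W(a_0) = W(a_0\#a_{x,\xi}) = a_0(x,\xi)\,W(a_0). $$

The verification of this identity proceeds by explicit calculation. By Lemma 15.52,

$$
\begin{aligned}
W(a_0)W(x,\xi)W(a_0)f &= (\phi_0\,\bar\otimes\,\phi_0)\,W(x,\xi)\,(\phi_0\,\bar\otimes\,\phi_0)f \\
&= (W(x,\xi)\phi_0|\phi_0)(f|\phi_0)\phi_0 = (W(x,\xi)\phi_0|\phi_0)\,W(a_0)f.
\end{aligned}
$$

Moreover, by (15.25), Lemma 15.52, and an elementary computation,

$$
\begin{aligned}
(W(x,\xi)\phi_0|\phi_0) &= e^{-\frac12 ix\cdot\xi}\big(e^{i(\cdot)\cdot\xi}\phi_0(\cdot-x)\big|\phi_0\big) \\
&= 2^{d/2}e^{-\frac12 ix\cdot\xi}\int_{\mathbb R^d} e^{iy\cdot\xi}\exp\big(-\tfrac12|y-x|^2\big)\exp\big(-\tfrac12|y|^2\big)\,dm(y) \\
&= \exp\big(-\tfrac14|x|^2\big)\exp\big(-\tfrac14|\xi|^2\big) = a_0(x,\xi).
\end{aligned}
$$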
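
The final "elementary computation", namely $(W(x,\xi)\phi_0|\phi_0) = a_0(x,\xi)$, can be sanity-checked numerically. The sketch below (plain Python, dimension $d = 1$) approximates the integral with respect to $dm(y) = (2\pi)^{-1/2}\,dy$ by a Riemann sum; the grid bounds, the step size and the function names are assumptions chosen for this illustration.

```python
import cmath
import math

STEP = 0.01
GRID = [-12.0 + STEP * n for n in range(2401)]  # sample points covering [-12, 12]
WEIGHT = STEP / math.sqrt(2 * math.pi)          # dm(y) = (2 pi)^{-1/2} dy for d = 1


def phi0(y):
    """Gaussian phi_0(y) = 2^{1/4} exp(-y^2 / 2), normalised in L^2(R, m)."""
    return 2 ** 0.25 * math.exp(-y * y / 2)


def a0(x, xi):
    """a_0(x, xi) = exp(-(x^2 + xi^2) / 4)."""
    return math.exp(-(x * x + xi * xi) / 4)


def matrix_coefficient(x, xi):
    """Riemann-sum approximation of (W(x, xi) phi_0 | phi_0), using (15.25)."""
    total = sum(cmath.exp(1j * y * xi) * phi0(y - x) * phi0(y) for y in GRID)
    return cmath.exp(-0.5j * x * xi) * total * WEIGHT
```

Here `matrix_coefficient(0.0, 0.0)` approximates $\|\phi_0\|^2$, and for other arguments the value should be close to `a0(x, xi)` with a negligible imaginary part.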


We are now ready for the proof of the Stone-von Neumann theorem. 


Proof of Theorem 15.48 We split the proof into three steps. 


Step 1 — We begin by showing that W(ao) is arank one projection. By Lemmas 15.52 
and 15.53 (applied to W), 


W (ao #ao) = W(ao)W (ao) = ($0 ® do)” = $0 ® 0 = W (a0). 


By the injectivity of W (which follows from Lemma 15.51), this implies that aj #a9 = 
ag. Another application of Lemma 15.53, this time to W, gives 


W (ao)W (ao) = W(ao#ao) = W(ao). 
This means that W(ao) is a projection. We will use the assumption of irreducibility of 
W to prove that this projection has rank one. 
We begin by showing that W(ao) # 0. Indeed, if we had W(ao) = 0, then for all 
x,E€ IR@ and hh’ € H we would have, by (15.21) and (15.26), 


0 = (W(x,€)W(ao)W (—x, —€ )hlh’) 
= (W (a,.2)W(—x, —€ alr) 


Lit -E—x-E! 
= Ler Sree aie =e) 


x (W(x,')W(—x, —§ )h|h’) de’ dé! 
a e7 TIA E—¥8) go (WE) 
R2d in 7” 
x (W(x—x,§ — €')W(—x, —€ hh’) dx’ dé’ 


= [tS Pag xl, 6) (—x, Eh!) de gb 


= fee Dana 8 )(W 0, Et!) ax! a8 
This being true for all $x,\xi\in\mathbb{R}^d$, the Fourier inversion theorem would then imply that $(\widetilde W(x',\xi')h|h') = 0$ for almost all $x',\xi'\in\mathbb{R}^d$. Since $h,h'\in H$ were arbitrary, it would follow that $\widetilde W(x',\xi') = 0$ for almost all $x',\xi'\in\mathbb{R}^d$, contradicting the fact that all these operators are unitary.

Fix any nonzero $h\in R(\widetilde W(a_0))$; this is possible by the preceding argument. Let $Y_h$ be the closed linear span of the set $\{\widetilde W(x,\xi)h : x,\xi\in\mathbb{R}^d\}$. From (15.21) we see that $Y_h$ is invariant under each operator $\widetilde W(x,\xi)$ and hence under the representation $\widetilde W$. Since $\widetilde W$ is assumed to be irreducible it follows that $Y_h = H$.
By (15.19) and (15.20),
W(x,é)* = eres UrVE = eG Ve = ety 6, =W(—x,-€). 
This identity implies that W(ao) is selfadjoint, and then we infer from Lemma 15.54 
that if h,h’ € R(W(ao)), say h = W(ao)g and h' = W(ao)g’, then 
(W(x, 6 )h|W(v,8)n') Wi-x, —§')W(x, §)W (ao)g|W (ao)s") 
= S59 (W (xx, § — 8" )W (ao) (a0)s") 
= se SS) a (x—x,€ — §')(W(ao)hln’) 
= oF! E #8) ag(x— x6 —E)(ale’), 


using in the last step that $(\widetilde W(a_0)h|h') = (\widetilde W(a_0)h|\widetilde W(a_0)h')$ since $\widetilde W(a_0)$ is an orthogonal projection. In particular, if $h\perp h'$ with $h,h'\in R(\widetilde W(a_0))$, then $Y_h\perp Y_{h'}$. Since $Y_h = H$ this implies $Y_{h'} = \{0\}$. In particular $\widetilde W(x,\xi)h' = 0$ for all $x,\xi\in\mathbb{R}^d$, and therefore $h' = 0$ by the injectivity of $\widetilde W(x,\xi)$ (recall that these operators are unitary). This proves that $R(\widetilde W(a_0))$ equals the one-dimensional span of $h$.


Step 2 – Define

\[ S\sum_{n=1}^N c_n W(x_n,\xi_n)\phi_0 := \sum_{n=1}^N c_n \widetilde W(x_n,\xi_n)h_0, \]

where $h_0\in R(\widetilde W(a_0))$ has norm $\|h_0\| = 1 = \|\phi_0\|$. It follows from (15.27) (applied to both $W$ and $\widetilde W$) that $S$ is well defined and isometric on the linear span of the functions $W(x,\xi)\phi_0$, $x,\xi\in\mathbb{R}^d$, and hence extends to an isometry from $Y_{\phi_0}$ onto $Y_{h_0}$, the former being defined as the closed linear span of the functions $W(x,\xi)\phi_0$, $x,\xi\in\mathbb{R}^d$. But $Y_{h_0} = H$, and by applying this to $W$ we see that likewise $Y_{\phi_0} = L^2(\mathbb{R}^d,m)$. This proves that $S$ is isometric from $L^2(\mathbb{R}^d,m)$ onto $H$, and hence unitary. Since $S\phi_0 = h_0$, this proves that $S$ has the desired properties.


Step 3 – If $T : L^2(\mathbb{R}^d,m)\to H$ is another unitary operator with the property that $\widetilde W(x,\xi,t) = TW(x,\xi,t)T^{-1}$ for all $(x,\xi,t)\in\mathbb{H}^d$, then $S^{-1}TW(x,\xi) = W(x,\xi)S^{-1}T$ for all $x,\xi\in\mathbb{R}^d$. From this it follows that $S^{-1}T$ commutes with $W(a_0)$, and therefore it maps the one-dimensional range of this operator onto itself. This implies that $S^{-1}Tf = e^{i\theta}f$ for some $\theta\in\mathbb{R}$ and all $f\in R(W(a_0))$. Then,

\[ T\sum_{n=1}^N c_n W(x_n,\xi_n)f = SS^{-1}T\sum_{n=1}^N c_n W(x_n,\xi_n)f = S\sum_{n=1}^N c_n W(x_n,\xi_n)S^{-1}Tf = e^{i\theta}S\sum_{n=1}^N c_n W(x_n,\xi_n)f \]

and therefore $T = e^{i\theta}S$.
Proof of Proposition 15.47 Reasoning by contradiction, suppose that $Y$ is a nontrivial closed subspace invariant under $W$ and let $Y^\perp$ be its orthogonal complement. The identity $W(x,\xi)^* = W(-x,-\xi)$ implies that $Y^\perp$ is invariant under $W$ as well. By restriction we thus obtain two strongly continuous unitary representations $W_Y : \mathbb{H}^d\to\mathscr{L}(Y)$ and $W_{Y^\perp} : \mathbb{H}^d\to\mathscr{L}(Y^\perp)$, and they satisfy

\[ W_Y(0,0,t) = e^{it}I_Y, \qquad W_{Y^\perp}(0,0,t) = e^{it}I_{Y^\perp}. \]

Lemma 15.52 and Step 1 of the proof of Theorem 15.48 imply that $W(a_0)$, $W_Y(a_0)$, and $W_{Y^\perp}(a_0)$ are orthogonal projections of rank one in $L^2(\mathbb{R}^d,m)$, $Y$, and $Y^\perp$, respectively. This leads to the contradiction

\[ \phi_0\otimes\phi_0 = W(a_0) = W_Y(a_0) + W_{Y^\perp}(a_0), \]

as it represents the rank one projection $\phi_0\otimes\phi_0$ as a sum of two disjoint rank one projections.


The final result of this section describes the Ornstein–Uhlenbeck semigroup in terms of the Weyl calculus. Let us first recall some notation from Theorem 13.58 (where a different normalisation of Lebesgue measure was used). The multiplication $E$,

\[ Ef(x) := \exp(-\tfrac14|x|^2)f(x), \]

is unitary from $L^2(\mathbb{R}^d,\gamma)$ to $L^2(\mathbb{R}^d,m)$, and the dilation $D$,

\[ Df(x) := 2^{d/4}f(\sqrt2\,x), \]

is unitary on $L^2(\mathbb{R}^d,m)$. Consequently the operator

\[ U := D\circ E \tag{15.28} \]

is unitary from $L^2(\mathbb{R}^d,\gamma)$ to $L^2(\mathbb{R}^d,m)$.


Theorem 15.55. For all $t > 0$ we have, with $s := \dfrac{1-e^{-t}}{1+e^{-t}}$,

\[ OU(t) = (1+s)^d\,U^{-1}W(\widehat{a_s})U, \]

where $W$ is the Schrödinger representation and

\[ a_s(x,\xi) = \exp(-s(|x|^2+|\xi|^2)), \qquad x,\xi\in\mathbb{R}^d. \]


Proof By the definition of the Weyl transform, the identity (15.25), and a change of 
variables, for a € L?(R*4,m) one has 


was) = [a(5(e+y).8) expl—iE (—y)) FG) dmx) dm(S), 
By the definition of U, this gives the explicit formula


U-'W(a)U f(y) 
1 
= Joua(s e+ 7s) 


x exp (iE (s— 5) exp(— 5a? + Fl?) ACVB) dnl) dnt 


= air ft G8) 


x exp (iE (>) exp(— shal? + jv?) F)am(E)am(s) 


=f. Koly.x) f(s) d(x) 


with 


Kala) = sapexp(—Glol?+ GbP*) [.a(S5-8) ex (6) amg. 


Applying this to the function a, we obtain 


1 1 1 
Kas(9.4) = sq5exP(—ghl? + gb’) 


xf exp(—s(l6P + gle-+y!?)) exp (65 


= sae ( lg , yf) cxo(— gle + gb ) 
xf exo(—s(16P +88 (2) ames) 
= s7aexP(~g-l—91?) exp(—$ r+?) exp(—Jls!?+ gb) 
x I, exp(—sin|?) din(7) 
= ps0 gia oo( feta) oo( jhe BP) 


ol 1 2/2. 112 _i 1 1» 
sacar xP(—g-(1—5)*(Ixl? + bl?) + (= —s)ay) exp(—slxl?). 


Therefore, 


U-'W(as)U f(y) 


=f Kas(os4) (2) d(x) 


es 2 (1 —s)*(ba!? +) + 5¢ 


s)xy) fed ax 
With $s = \dfrac{1-e^{-t}}{1+e^{-t}}$, this identity implies
(1+s)*U'W(as)U f(y) 
_ 1 ( 2 Wee 
~ 24(2n)4/2\ 1 +e) \1-et 
—2t —t 


1 e 2, Dan e 1 5 
xf exp(— 5 aa (le +b?) + pay) Fla) exp(— she?) ae 


=e 


=> oil 1 /2 1 ety — x? 
- (2n)4/2 (ja) ‘2 exp(—5 l—e-2 ) Fla)ax 


= [ MG.) Flea = OU (t) f(y), 


where

\[ M_t(y,x) = \frac{1}{(2\pi)^{d/2}}\Bigl(\frac{1}{1-e^{-2t}}\Bigr)^{d/2}\exp\Bigl(-\frac12\,\frac{|e^{-t}y-x|^2}{1-e^{-2t}}\Bigr) \]

is the Mehler kernel; the last step used the Mehler formula (13.32) for $OU(t)$.


15.6 Second Quantisation 


Up to this point we have been concerned with the problem of first quantisation, namely, 
how to define the quantum analogues of classical observables. In order to arrive at a 
version of Quantum Mechanics that is consistent with Special Relativity, one must be 
able to describe systems with a variable number of particles. This is due to the fact 
that the equivalence of mass and energy makes it possible that particles are created and 
annihilated. If one uses a Hilbert space H to describe the pure states of a single particle, 
one postulates that the n-fold Hilbert space tensor product 


\[ H^{\otimes n} := \underbrace{H\otimes\cdots\otimes H}_{n\ \text{times}} \]

describes the pure states of a system of $n$ such particles. More precisely, as explained in Appendix B, we have a direct sum decomposition

\[ H^{\otimes n} = \Gamma^n(H)\oplus\Lambda^n(H) \]

into symmetric and antisymmetric tensor products. A boson is a particle whose $n$-particle states are given by elements of $\Gamma^n(H)$ and a fermion is a particle whose $n$-particle states are given by elements of $\Lambda^n(H)$. We will discuss the bosonic theory only;
the fermionic theory requires deeper tools from noncommutative analysis that would 
take us too far afield. The bosonic theory, moreover, has interesting connections to sev- 
eral other topics covered in this work. 
The elements of the Hilbert space direct sum

\[ \Gamma(H) := \bigoplus_{n\in\mathbb{N}}\Gamma^n(H) \]

correspond to superpositions of states carrying different numbers of bosons. The process of passing from $H$ to $\Gamma(H)$ is called (bosonic, or symmetric) second quantisation. The observation that every contraction $T$ on $H$ extends to a contraction $\Gamma(T)$ on $\Gamma(H)$ (see Section 15.6.c) allows us to establish a beautiful connection, for the special case $H = \mathbb{K}^d$, with the Ornstein–Uhlenbeck semigroup discussed in Section 13.6.e, namely,

\[ OU(t) = \Gamma(e^{-t}I) \]

(Theorem 15.68). Under this correspondence, the negative generator $-L$ of this semi-
group corresponds to the number operator of Section 15.3.d (where a unitarily equiva- 
lent model of it was studied). Our study will also uncover a deep connection between 
second quantisation and the Fourier transform: over the complex scalars, the Fourier— 
Plancherel transform is unitarily equivalent to the second quantisation of the operator 
—iI (Theorem 15.70). Taken together, these facts connect the Fourier—Plancherel trans- 
form to the Ornstein—Uhlenbeck semigroup. Some connections of second quantisation 
with Number Theory will be discussed in the Notes. 

For simplicity we will limit ourselves to the case where the Hilbert space describing 
the pure states of a single particle is finite-dimensional. Mutatis mutandis, the theory 
generalises to arbitrary Hilbert spaces H if one replaces the Gaussian measure by a so- 
called H-isonormal process, a central object in Malliavin calculus. Although this gen- 
eralisation does not pose any mathematical difficulties we will not pursue it, as it adds 
a layer of abstraction that would only obscure the various connections just described. 

Unless otherwise stated the scalar field K is allowed to be either real or complex. 


15.6.a The Wiener-It6 Chaos Decomposition 
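For $h\in\mathbb{K}^d$ we define $\phi_h\in L^2(\mathbb{R}^d,\gamma) = L^2(\mathbb{R}^d,\gamma;\mathbb{K})$ by

\[ \phi_h(x) := (x|h) = \sum_{j=1}^d x_j\overline{h_j}, \qquad x\in\mathbb{R}^d. \]

Let $(H_n)_{n\in\mathbb{N}}$ be the sequence of Hermite polynomials introduced in Section 3.5.b. For $n\in\mathbb{N}$ we define

\[ \mathscr{H}_n := \overline{\mathrm{span}}\,\{H_n(\phi_h) : |h| = 1\}, \]

the closure being taken in $L^2(\mathbb{R}^d,\gamma)$. Here, $(H_n(\phi_h))(x) := H_n(\phi_h(x)) = H_n((x|h))$ for $x\in\mathbb{R}^d$. The space $\mathscr{H}_n$ is sometimes referred to as the Gaussian chaos of order $n$. Note that $\mathscr{H}_0 = \mathbb{K}\mathbf{1}$ is the one-dimensional space of constant functions.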
We are going to prove that the subspaces $\mathscr{H}_n$ are pairwise orthogonal and induce a direct sum decomposition

\[ \bigoplus_{n\in\mathbb{N}}\mathscr{H}_n = L^2(\mathbb{R}^d,\gamma). \]

In a second step we will identify orthonormal bases for the summands $\mathscr{H}_n$.
To start our analysis, for $h\in\mathbb{K}^d$ we define the functions

\[ K_h := \exp(\phi_h - \tfrac12|h|^2). \]

From $\exp(\phi_h)\in L^2(\mathbb{R}^d,\gamma)$ we see that $K_h$ is well defined as an element of $L^2(\mathbb{R}^d,\gamma)$. From (3.15) we see that

\[ K_h = \exp(\phi_h - \tfrac12|h|^2) = \sum_{n\in\mathbb{N}}\frac{|h|^n}{n!}H_n(\phi_h/|h|), \qquad 0\neq h\in\mathbb{K}^d. \tag{15.29} \]

In particular,

\[ K_h = \sum_{n\in\mathbb{N}}\frac{1}{n!}H_n(\phi_h), \qquad |h| = 1. \tag{15.30} \]

In (3.15) we worked over the real scalars; over the complex scalars, (3.15), (15.29), and (15.30) follow by analytic continuation.


Lemma 15.56. The functions $K_h$, $h\in\mathbb{K}^d$, span a dense subspace in $L^2(\mathbb{R}^d,\gamma)$.


Proof Suppose that $f\in L^2(\mathbb{R}^d,\gamma)$ is such that $\int_{\mathbb{R}^d} fK_h\,d\gamma = 0$ for all $h\in\mathbb{K}^d$. Then $\int_{\mathbb{R}^d} f\exp(\phi_h)\,d\gamma = 0$ for all $h\in\mathbb{K}^d$. Taking $h := \sum_{j=1}^d c_je_j$, with $e_j$ the $j$th standard unit vector of $\mathbb{R}^d$, we see that

\[ \int_{\mathbb{R}^d} f(x)\exp\Bigl(\sum_{j=1}^d c_jx_j\Bigr)\,d\gamma(x) = 0 \]

for all $c_1,\dots,c_d\in\mathbb{K}$. In case the scalar field is real, by analytic continuation we obtain that the same holds for all $c_1,\dots,c_d\in\mathbb{C}$. Taking $c_j = -iy_j$ with $y_j\in\mathbb{R}$, this implies that the Fourier transform of the function $x\mapsto f(x)\exp(-\tfrac12|x|^2)$ vanishes. By the injectivity of the Fourier transform (Theorem 5.21) we conclude that $f(x)\exp(-\tfrac12|x|^2) = 0$ for almost all $x\in\mathbb{R}^d$, that is, $f(x) = 0$ for almost all $x\in\mathbb{R}^d$.


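Theorem 15.57 (Wiener–Itô decomposition). We have the orthogonal decomposition

\[ L^2(\mathbb{R}^d,\gamma) = \bigoplus_{n\in\mathbb{N}}\mathscr{H}_n. \]

Proof Fix $h,h'\in\mathbb{K}^d$ with $|h| = |h'| = 1$. Repeating the steps of (3.16), for all $s,t\in\mathbb{R}$ we have

\[ (H(s,\phi_h)|H(t,\phi_{h'}))_{L^2(\mathbb{R}^d,\gamma)} = \exp(st\,(h|h')). \]

Substituting $H(r,\cdot) = \sum_{k=0}^\infty \frac{r^k}{k!}H_k$ and $\exp(st\,(h|h')) = \sum_{k=0}^\infty \frac{(st)^k}{k!}(h|h')^k$, and taking the partial derivative $\frac{\partial^{m+n}}{\partial s^m\,\partial t^n}$ at $s = t = 0$ on both sides of the resulting identity, we obtain

\[ (H_m(\phi_h)|H_n(\phi_{h'}))_{L^2(\mathbb{R}^d,\gamma)} = \delta_{mn}\,n!\,(h|h')^n. \]

Noting that $\delta_{mn}\,n! = \delta_{mn}\sqrt{m!}\sqrt{n!}$, this can be equivalently stated as

\[ \Bigl(\frac{1}{\sqrt{m!}}H_m(\phi_h)\Big|\frac{1}{\sqrt{n!}}H_n(\phi_{h'})\Bigr) = \delta_{mn}\,(h|h')^n. \tag{15.31} \]

For $m\neq n$ this implies $\mathscr{H}_m\perp\mathscr{H}_n$.

If $f\perp\mathscr{H}_n$ for all $n\in\mathbb{N}$, then $(f|H_n(\phi_h))_{L^2(\mathbb{R}^d,\gamma)} = 0$ for all $n\in\mathbb{N}$ and all $h\in\mathbb{K}^d$ with $|h| = 1$, and therefore (15.29) implies that $(f|K_h)_{L^2(\mathbb{R}^d,\gamma)} = 0$ for all $h\in\mathbb{K}^d$, and therefore $f = 0$ by Lemma 15.56.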
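
The orthogonality relation (15.31) lends itself to a quick numerical sanity check. The following is a minimal sketch, assuming real scalars, $d = 2$, and NumPy's probabilists' Hermite module `numpy.polynomial.hermite_e`; the function names `hermite` and `chaos_inner_product` are illustrative and not part of the text. It approximates $(H_m(\phi_h)|H_n(\phi_{h'}))$ by tensorised Gauss–Hermite quadrature and compares it with $\delta_{mn}\,n!\,(h|h')^n$.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite(n, x):
    """Evaluate the probabilists' Hermite polynomial H_n at x."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return hermeval(x, coeffs)


def chaos_inner_product(m, n, h, hp, deg=20):
    """Approximate (H_m(phi_h) | H_n(phi_hp)) in L^2(R^2, gamma) for real vectors h, hp."""
    nodes, weights = hermegauss(deg)               # quadrature for the weight exp(-x^2/2)
    weights = weights / math.sqrt(2.0 * math.pi)   # normalise to the standard Gaussian measure
    x1, x2 = np.meshgrid(nodes, nodes, indexing="ij")
    w = np.outer(weights, weights)                 # product weights for gamma on R^2
    phi_h = h[0] * x1 + h[1] * x2                  # phi_h(x) = (x|h)
    phi_hp = hp[0] * x1 + hp[1] * x2
    return float(np.sum(w * hermite(m, phi_h) * hermite(n, phi_hp)))


theta = 0.7
h = (1.0, 0.0)
hp = (math.cos(theta), math.sin(theta))            # unit vector with (h|hp) = cos(theta)
print(chaos_inner_product(3, 3, h, hp), math.factorial(3) * math.cos(theta) ** 3)
print(chaos_inner_product(2, 3, h, hp))            # different orders: should vanish
```

Since the integrands are polynomials of low degree, the quadrature is exact up to rounding, so the two numbers in the first line should agree and the second line should be numerically zero.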


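The next result shows that the Wiener–Itô decomposition diagonalises the Ornstein–Uhlenbeck semigroup $OU$ on $L^2(\mathbb{R}^d,\gamma)$ introduced in Section 13.6.e:


Theorem 15.58. The following identities hold:


(1) for all $h\in\mathbb{K}^d$ and $t\ge 0$,

\[ OU(t)K_h = K_{e^{-t}h}; \]

(2) for all $n\in\mathbb{N}$, $F\in\mathscr{H}_n$, and $t\ge 0$,

\[ OU(t)F = e^{-nt}F. \]


Proof Completing squares in the exponential, for all $h\in\mathbb{R}^d$ and $t\ge 0$ we have

\[ \int_{\mathbb{R}^d}\exp\bigl(\sqrt{1-e^{-2t}}\,\phi_h(y)\bigr)\,d\gamma(y) = \exp\bigl(\tfrac12(1-e^{-2t})|h|^2\bigr)\,\frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d}\exp\bigl(-\tfrac12|y-\sqrt{1-e^{-2t}}\,h|^2\bigr)\,dy = \exp\bigl(\tfrac12(1-e^{-2t})|h|^2\bigr). \]

Hence, from the definitions of $OU(t)$, $\phi_h$, and $K_h$,

\[ OU(t)K_h(x) = \int_{\mathbb{R}^d}\exp\bigl(\phi_h(e^{-t}x+\sqrt{1-e^{-2t}}\,y) - \tfrac12|h|^2\bigr)\,d\gamma(y) \]
\[ = \exp\bigl(e^{-t}(x|h) - \tfrac12|h|^2\bigr)\int_{\mathbb{R}^d}\exp\bigl(\sqrt{1-e^{-2t}}\,\phi_h(y)\bigr)\,d\gamma(y) \]
\[ = \exp\bigl(e^{-t}(x|h) - \tfrac12|h|^2 + \tfrac12(1-e^{-2t})|h|^2\bigr) \]
\[ = \exp\bigl((x|e^{-t}h) - \tfrac12|e^{-t}h|^2\bigr) = K_{e^{-t}h}(x). \]

This gives (1) when $\mathbb{K} = \mathbb{R}$. When $\mathbb{K} = \mathbb{C}$, the result follows by analytic continuation.

If $|h| = 1$ and $s > 0$, it follows from (15.29) and the preceding calculation that

\[ OU(t)\sum_{n\in\mathbb{N}}\frac{s^n}{n!}H_n(\phi_h) = OU(t)K_{sh} = K_{e^{-t}sh} = \sum_{n\in\mathbb{N}}\frac{e^{-nt}s^n}{n!}H_n(\phi_h). \]

Taking $n$th derivatives in $s$ and evaluating at $s = 0$, we obtain the identity

\[ OU(t)H_n(\phi_h) = e^{-nt}H_n(\phi_h). \]

By linearity and taking limits, this gives (2).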
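
Part (2) can be illustrated numerically in dimension one. The sketch below is only a sanity check under stated assumptions: real scalars, $d = 1$, $h = 1$, and the Mehler-type formula $OU(t)f(x) = \int f(e^{-t}x + \sqrt{1-e^{-2t}}\,y)\,d\gamma(y)$ evaluated by Gauss–Hermite quadrature. The names `hermite` and `ou_semigroup` are illustrative.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite(n, x):
    """Evaluate the probabilists' Hermite polynomial H_n at x."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return hermeval(x, coeffs)


def ou_semigroup(f, t, x, deg=40):
    """Approximate OU(t)f(x) = int f(e^{-t} x + sqrt(1 - e^{-2t}) y) dgamma(y) on R."""
    nodes, weights = hermegauss(deg)
    weights = weights / math.sqrt(2.0 * math.pi)   # standard Gaussian probability weights
    shift = math.exp(-t) * x
    scale = math.sqrt(1.0 - math.exp(-2.0 * t))
    return float(np.sum(weights * f(shift + scale * nodes)))


t, x = 0.3, 1.2
for n in range(5):
    lhs = ou_semigroup(lambda y: hermite(n, y), t, x)
    rhs = math.exp(-n * t) * float(hermite(n, x))
    print(n, lhs, rhs)   # the two columns should agree up to rounding
```

Each Hermite polynomial is thus seen to be an eigenfunction with eigenvalue $e^{-nt}$, in line with the statement $OU(t)F = e^{-nt}F$ for $F\in\mathscr{H}_n$.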


Part (2) of the theorem implies that each summand $\mathscr{H}_n$ is contained in $D(L)$ and

\[ LF = -nF, \qquad F\in\mathscr{H}_n,\ n\in\mathbb{N}. \]

Over the complex scalars, Proposition 10.32 can be applied and we obtain:


Corollary 15.59. $\sigma(-L) = \mathbb{N}$.


15.6.b The Wiener-It6 Isometry 


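Our next aim is to find an orthonormal basis for each summand $\mathscr{H}_n$. This will be achieved in Theorem 15.60 by means of multivariate Hermite polynomials.


For $\mathbf{n} = (n_1,\dots,n_k)\in\mathbb{N}^k$ we write

\[ |\mathbf{n}| := \sum_{j=1}^k n_j, \qquad \mathbf{n}! := \prod_{j=1}^k n_j!. \]


Theorem 15.60. Let $\mathbf{h} = (h_j)_{j=1}^d$ be an orthonormal basis for $\mathbb{K}^d$. For each $n\in\mathbb{N}$ the family

\[ \Bigl\{\frac{1}{\sqrt{\mathbf{n}!}}H_{\mathbf{n}}(\phi_{\mathbf{h}}) : \mathbf{n}\in\mathbb{N}^d,\ |\mathbf{n}| = n\Bigr\} \]

defines an orthonormal basis for $\mathscr{H}_n$. As a consequence, the family

\[ \Bigl\{\frac{1}{\sqrt{\mathbf{n}!}}H_{\mathbf{n}}(\phi_{\mathbf{h}}) : \mathbf{n}\in\mathbb{N}^d\Bigr\} \]

defines an orthonormal basis for $L^2(\mathbb{R}^d,\gamma)$.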
Proof The proof is divided into three steps.
Step 1 – First we prove that the family $\{\frac{1}{\sqrt{\mathbf{n}!}}H_{\mathbf{n}}(\phi_{\mathbf{h}}) : \mathbf{n}\in\mathbb{N}^d\}$ is an orthonormal system in $L^2(\mathbb{R}^d,\gamma)$. Let $\mathbf{m},\mathbf{n}\in\mathbb{N}^d$. By separation of variables and (15.31),


dd 1 1 


(FHm(On)| Halon) = [TTT (Gag (On) Fit) 


dd d 
= TTT on, (Ai|hj)"¥ = Torin; = omn- 


i=1 j=1 j=l 


(15.32) 


_— 


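Step 2 – Next we prove completeness of this system in $L^2(\mathbb{R}^d,\gamma)$. Suppose $f\in L^2(\mathbb{R}^d,\gamma)$ is such that $(f|H_{\mathbf{n}}(\phi_{\mathbf{h}})) = 0$ for all $\mathbf{n}\in\mathbb{N}^d$. Fix an arbitrary $h\in\mathbb{K}^d$ and put $g_k := \sum_{j=0}^k \frac{1}{j!}\phi_h^j$. Then $\lim_{k\to\infty}g_k = \exp(\phi_h)$ in $L^2(\mathbb{R}^d,\gamma)$ by dominated convergence. By writing $h = \sum_{j=1}^d c_jh_j$ we see that each $g_k$ is a polynomial in $\phi_{h_1},\dots,\phi_{h_d}$, and such polynomials are linear combinations of the functions $H_{\mathbf{n}}(\phi_{\mathbf{h}})$ for appropriate multi-indices $\mathbf{n}\in\mathbb{N}^d$. It follows that $(f|g_k) = 0$ for all $k\in\mathbb{N}$. Passing to the limit $k\to\infty$ it follows that $(f|\exp(\phi_h)) = 0$, and therefore $(f|K_h) = 0$. Since $h\in\mathbb{K}^d$ was arbitrary, Lemma 15.56 implies that $f = 0$. Together with Step 1, this proves that $\{\frac{1}{\sqrt{\mathbf{n}!}}H_{\mathbf{n}}(\phi_{\mathbf{h}}) : \mathbf{n}\in\mathbb{N}^d\}$ is an orthonormal basis of $L^2(\mathbb{R}^d,\gamma)$.

Step 3 – The final step is to prove that $\{\frac{1}{\sqrt{\mathbf{n}!}}H_{\mathbf{n}}(\phi_{\mathbf{h}}) : \mathbf{n}\in\mathbb{N}^d,\ |\mathbf{n}| = n\}$ is an orthonormal basis for $\mathscr{H}_n$. Denote by $\mathscr{G}_n$ the closed linear span of the set $\{H_{\mathbf{n}}(\phi_{\mathbf{h}}) : \mathbf{n}\in\mathbb{N}^d,\ |\mathbf{n}| = n\}$. By Step 1, $\mathscr{G}_m\perp\mathscr{G}_n$ if $m\neq n$. If $h = \sum_{j=1}^d c_jh_j\in\mathbb{K}^d$ and $0\le m\le n$, then $H_m(\phi_h) = H_m(\sum_{j=1}^d c_j\phi_{h_j})$ is a linear combination of polynomials of the form $H_{\mathbf{k}}(\phi_{\mathbf{h}})$ with $|\mathbf{k}|\le m$, and therefore $H_m(\phi_h)\in\bigoplus_{j=0}^m\mathscr{G}_j$. In particular this implies that $\mathscr{H}_m\subseteq\bigoplus_{j=0}^m\mathscr{G}_j\subseteq\bigoplus_{j=0}^n\mathscr{G}_j$ and therefore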

Also, by Step 1, 


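It follows that $\mathscr{H}_n\subseteq\mathscr{G}_n$. This being true for all $n\in\mathbb{N}$, by Step 2 it follows that

\[ L^2(\mathbb{R}^d,\gamma) = \bigoplus_{n\in\mathbb{N}}\mathscr{H}_n\subseteq\bigoplus_{n\in\mathbb{N}}\mathscr{G}_n = L^2(\mathbb{R}^d,\gamma). \]

From this we infer that $\mathscr{H}_n = \mathscr{G}_n$ for all $n\in\mathbb{N}$.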


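The orthogonal projection in $L^2(\mathbb{R}^d,\gamma)$ onto $\mathscr{H}_n$ will be denoted by $I_n$.

Corollary 15.61. For all $n\in\mathbb{N}$ and $h\in\mathbb{K}^d$ with $|h| = 1$ we have

\[ I_n(\phi_h^n) = H_n(\phi_h). \]

More generally, if $h_1,\dots,h_k$ are orthonormal in $\mathbb{K}^d$ and $\mathbf{n} = (n_1,\dots,n_k)\in\mathbb{N}^k$, then

\[ I_{|\mathbf{n}|}(\phi_{h_1}^{n_1}\cdots\phi_{h_k}^{n_k}) = H_{n_1}(\phi_{h_1})\cdots H_{n_k}(\phi_{h_k}). \]

Proof We have

\[ H_{n_1}(\phi_{h_1})\cdots H_{n_k}(\phi_{h_k}) = \phi_{h_1}^{n_1}\cdots\phi_{h_k}^{n_k} + q(\phi_{h_1},\dots,\phi_{h_k}), \tag{15.33} \]

where $q$ is a polynomial of degree strictly less than $|\mathbf{n}|$. Hence Theorem 15.60 implies that $q(\phi_{h_1},\dots,\phi_{h_k})\in\bigoplus_{j=0}^{|\mathbf{n}|-1}\mathscr{H}_j$, and the corollary follows by projecting (15.33) onto $\mathscr{H}_{|\mathbf{n}|}$. Since $H_{n_1}(\phi_{h_1})\cdots H_{n_k}(\phi_{h_k})\in\mathscr{H}_{|\mathbf{n}|}$, the left-hand side remains unchanged; the right-hand side is mapped to $I_{|\mathbf{n}|}(\phi_{h_1}^{n_1}\cdots\phi_{h_k}^{n_k})$.


Corollary 15.62. For all $h\in\mathbb{K}^d$ with $|h| = 1$,

\[ K_h = \sum_{n\in\mathbb{N}}\frac{1}{n!}I_n(\phi_h^n). \]


Proof This follows from the previous corollary and (15.30).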
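Proposition 15.63. For all $g_1,\dots,g_n\in\mathbb{K}^d$ and $h_1,\dots,h_n\in\mathbb{K}^d$ we have

\[ (I_n(\phi_{g_1}\cdots\phi_{g_n})|I_n(\phi_{h_1}\cdots\phi_{h_n})) = \sum_{\sigma\in S_n}(g_1|h_{\sigma(1)})\cdots(g_n|h_{\sigma(n)}), \]

where $S_n$ is the group of permutations of $\{1,\dots,n\}$.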


Proof Let (e ea be the standard basis of K%, Choose nonnegative integers ¢1,...,¢; 
and m ,...,mm such that ¢; +---+@; =m, +---+ mz =n. By adding extra zeroes to the 
shortest of these two sequences we may assume that j = k. By Corollary 15.61, 


In(Geh a OE )= Ae, (Gei, Ise “Hy, (e;, )> 


Cin 
Jn( O22) = ny (Ges, Hn (Ge) 


Hence, by (15.32), 


L Ly m m _ 
(In( Ge; a en Mn (e; ve ein) =. m! dp, m, + Oo mg: 
On the other hand, with 
81S = 8h, = ei tee Sly ttl tl Ft = 8b 44+ = Si 
1 > ? 1 k-1 1 k Ik (15.34) 
hy = =hn, SS Ci, eee y Pn tty +1 Sr SIME hl, = Cig 
we have 


y (g1 lAgc1)) ne (8n|Ao(n))) 


OESp 


Y> (ei \Ao(1y) + (i, Ao)“ 


OCSn 


(Cig Meo (ey 4-461 41)) 7 (Cig Ao ey 4-461 +&)) 
=m !.. “mx! i Oe, my aa Op,m al m!dp, m, ro Ob, mp: 
This proves the corollary in the special case (15.34). By n-linearity, the corollary then 


also follows if each of the functions f and g are finite linear combinations of such 
expressions, and finally the general case follows by density. 


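The main result of this section relates the spaces $\mathscr{H}_n$ to the $n$-fold symmetric tensor product of $\mathbb{K}^d$. The $n$-fold tensor product

\[ (\mathbb{K}^d)^{\otimes n} = \underbrace{\mathbb{K}^d\otimes\cdots\otimes\mathbb{K}^d}_{n\ \text{times}} \]

is a Hilbert space with respect to the inner product

\[ (g_1\otimes\cdots\otimes g_n|h_1\otimes\cdots\otimes h_n) = \prod_{j=1}^n (g_j|h_j). \tag{15.35} \]

We identify $(\mathbb{K}^d)^{\otimes 0}$ with the scalar field $\mathbb{K}$. From Appendix B we recall that the $n$-fold symmetric tensor product of $\mathbb{K}^d$, denoted by $\Gamma^n(\mathbb{K}^d)$, is defined as the range of the orthogonal projection $P_\Gamma\in\mathscr{L}((\mathbb{K}^d)^{\otimes n})$ given by

\[ P_\Gamma(h_1\otimes\cdots\otimes h_n) := \frac{1}{n!}\sum_{\sigma\in S_n}h_{\sigma(1)}\otimes\cdots\otimes h_{\sigma(n)}, \qquad h_1,\dots,h_n\in\mathbb{K}^d, \]

and extended by linearity, where $S_n$ is the group of permutations of $\{1,\dots,n\}$. Equivalently, $\Gamma^n(\mathbb{K}^d)$ is the subspace of those elements of $(\mathbb{K}^d)^{\otimes n}$ that are invariant under the action of $S_n$.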
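
For small $d$ and $n$ the symmetrising projection can be written down as an explicit matrix. The following sketch assumes real scalars and identifies $(\mathbb{R}^d)^{\otimes n}$ with $\mathbb{R}^{d^n}$ via the standard product basis in lexicographic order; the helper name `symmetriser` is illustrative. It checks that the matrix is an orthogonal projection and that its rank is $\binom{d+n-1}{n}$, the number of multi-indices $\mathbf{n}\in\mathbb{N}^d$ with $|\mathbf{n}| = n$.

```python
import itertools
import math
import numpy as np


def symmetriser(d, n):
    """Matrix of P(h_1 x ... x h_n) = (1/n!) sum_sigma h_sigma(1) x ... x h_sigma(n), for n >= 1."""
    shape = (d,) * n
    perms = list(itertools.permutations(range(n)))
    P = np.zeros((d ** n, d ** n))
    # itertools.product enumerates index tuples in the same (row-major) order
    # that np.ravel_multi_index uses, so `col` is the flat index of `idx`.
    for col, idx in enumerate(itertools.product(range(d), repeat=n)):
        for sigma in perms:
            permuted = tuple(idx[sigma[k]] for k in range(n))
            row = np.ravel_multi_index(permuted, shape)
            P[row, col] += 1.0 / len(perms)
    return P


P = symmetriser(3, 2)
print(np.allclose(P @ P, P), np.allclose(P, P.T))   # idempotent and selfadjoint
print(round(np.trace(P)), math.comb(3 + 2 - 1, 2))  # rank of P versus the binomial count
```

The trace of a projection equals its rank, so the last line compares the dimension of the symmetric tensor product with the number of multivariate Hermite polynomials of the corresponding degree.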

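For the formulation of the next theorem we introduce the following notation. Let $\mathbf{h} = (h_j)_{j=1}^k$ be a finite sequence in $\mathbb{K}^d$. For $\mathbf{n}\in\mathbb{N}^k$ with $|\mathbf{n}| = n$ let

\[ \mathbf{h}^{\otimes\mathbf{n}} := h_1^{\otimes n_1}\otimes\cdots\otimes h_k^{\otimes n_k}, \]

where $h_j^{\otimes n_j} = h_j\otimes\cdots\otimes h_j$ ($n_j$ times) with the convention that $h^{\otimes 0} = 1$. Similarly we let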
\[ \phi_{\mathbf{h}}^{\mathbf{n}} := \phi_{h_1}^{n_1}\cdots\phi_{h_k}^{n_k}. \]

Theorem 15.64 (Wiener–Itô isometry). There exists a unique isometric isomorphism

\[ W : \Gamma(\mathbb{K}^d)\simeq L^2(\mathbb{R}^d,\gamma) \]

with the following property: For every $1\le k\le d$, every orthonormal system $\mathbf{h} = (h_j)_{j=1}^k$ in $\mathbb{K}^d$, every $n\in\mathbb{N}$, and every multi-index $\mathbf{n}\in\mathbb{N}^k$ with $|\mathbf{n}| = n$,

\[ W(P_\Gamma(\mathbf{h}^{\otimes\mathbf{n}})) = \frac{1}{\sqrt{n!}}H_{\mathbf{n}}(\phi_{\mathbf{h}}). \]

This mapping $W$ will be referred to as the Wiener–Itô isometry. To appreciate the role of the scalar field, one should keep in mind that $L^2(\mathbb{R}^d,\gamma)$ always means $L^2(\mathbb{R}^d,\gamma;\mathbb{K})$; hence, over the real scalars, the Wiener–Itô isometry gives an isometric isomorphism $W : \Gamma(\mathbb{R}^d)\simeq L^2(\mathbb{R}^d,\gamma;\mathbb{R})$; over the complex scalars it gives an isometric isomorphism $W : \Gamma(\mathbb{C}^d)\simeq L^2(\mathbb{R}^d,\gamma;\mathbb{C})$.


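Proof Uniqueness being clear, we concentrate on existence.
Let $\mathbf{e} = (e_j)_{j=1}^d$ be the standard basis in $\mathbb{K}^d$. For every integer $n\in\mathbb{N}$ consider the linear mapping $W_n : \Gamma^n(\mathbb{K}^d)\to\mathscr{H}_n$ defined by

\[ W_n : P_\Gamma(\mathbf{e}^{\otimes\mathbf{n}})\mapsto\frac{1}{\sqrt{n!}}H_{\mathbf{n}}(\phi_{\mathbf{e}}), \qquad \mathbf{n}\in\mathbb{N}^d,\ |\mathbf{n}| = n, \]

and extended by linearity. We begin by showing that if $\mathbf{h} = (h_j)_{j=1}^k$ is any orthonormal system in $\mathbb{K}^d$, then

\[ W_n(P_\Gamma(\mathbf{h}^{\otimes\mathbf{n}})) = \frac{1}{\sqrt{n!}}H_{\mathbf{n}}(\phi_{\mathbf{h}}), \qquad \mathbf{n}\in\mathbb{N}^k,\ |\mathbf{n}| = n. \tag{15.36} \]

As a second step we show that $W_n$ is an isometry from $\Gamma^n(\mathbb{K}^d)$ onto $\mathscr{H}_n$. In view of the Wiener–Itô decomposition, these two facts prove the theorem.


Step 1 – Let $h_i = \sum_{j=1}^d c_{ij}e_j$, $i = 1,\dots,k$, be the expansion in terms of the standard basis. Let $\mathbf{n}\in\mathbb{N}^k$ be a multi-index satisfying $|\mathbf{n}| = n$. Then,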


W, (Pr, (h°")) =W (n. (Save Pu Q---@ (Zeuei)"") 
i= j= 


|m|=n inf 
1 d 
= am Jai tin (Pe) = al 2 om LL Fn (Ge) 
1 d Pee ; . 
= Fi eo )= Td 2, »[ 19%") 
k d is 
- Tah E ame = sah (11(, 0%.) ) 


where the coefficients $a_{\mathbf{m}}$, $\mathbf{m}\in\mathbb{N}^d$, are determined by the identity

\[ \sum_{|\mathbf{m}| = n}a_{\mathbf{m}}\xi^{\mathbf{m}} = \prod_{i=1}^k\Bigl(\sum_{j=1}^d c_{ij}\xi_j\Bigr)^{n_i} \]

in the formal variables $\xi_1,\dots,\xi_d$ and $\xi^{\mathbf{m}} = \xi_1^{m_1}\cdots\xi_d^{m_d}$. This establishes (15.36).


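Step 2 – In this step we show that the mappings $W_n$ are isometric from $\Gamma^n(\mathbb{K}^d)$ onto $\mathscr{H}_n$. First let $h_1,\dots,h_n\in\mathbb{K}^d$ be arbitrary. By Proposition 15.63 we have

\[ \|I_n(\phi_{h_1}\cdots\phi_{h_n})\|^2 = \sum_{\sigma\in S_n}(h_1|h_{\sigma(1)})\cdots(h_n|h_{\sigma(n)}) = \sum_{\sigma\in S_n}(h_1\otimes\cdots\otimes h_n|h_{\sigma(1)}\otimes\cdots\otimes h_{\sigma(n)}) \]
\[ = n!\,(h_1\otimes\cdots\otimes h_n|P_\Gamma(h_1\otimes\cdots\otimes h_n)) = n!\,\|P_\Gamma(h_1\otimes\cdots\otimes h_n)\|^2, \]

the projection $P_\Gamma$ being orthogonal and hence selfadjoint. Specialising to the standard basis of $\mathbb{K}^d$ and using Corollary 15.61, we obtain

\[ \|P_\Gamma(\mathbf{e}^{\otimes\mathbf{n}})\| = \|P_\Gamma(e_1^{\otimes n_1}\otimes\cdots\otimes e_d^{\otimes n_d})\| = \frac{1}{\sqrt{n!}}\|I_n(\phi_{e_1}^{n_1}\cdots\phi_{e_d}^{n_d})\| = \frac{1}{\sqrt{n!}}\|H_{n_1}(\phi_{e_1})\cdots H_{n_d}(\phi_{e_d})\| = \frac{1}{\sqrt{n!}}\|H_{\mathbf{n}}(\phi_{\mathbf{e}})\|. \]

This identity extends to finite linear combinations by the Pythagorean identity, noting that both on the left and on the right the contributing terms in the sums are orthogonal. This proves that $W_n$ is an isometry. Since the multivariate Hermite polynomials of degree $n$ form an orthonormal basis in $\mathscr{H}_n$ after normalisation, this isometry is surjective.


As a special case, note that for all $h\in\mathbb{K}^d$ with $|h| = 1$,

\[ W(h^{\otimes n}) = \frac{1}{\sqrt{n!}}H_n(\phi_h). \tag{15.37} \]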

15.6.c Second Quantised Operators

For a linear operator $T$ on $H$ we obtain a linear operator $T^{\otimes n}$ on $H^{\otimes n}$ by

\[ T^{\otimes n}(h_1\otimes\cdots\otimes h_n) := Th_1\otimes\cdots\otimes Th_n \]

and linearity. For $n = 0$ we understand that $H^{\otimes 0} = \mathbb{K}$ and $T^{\otimes 0} = I_{\mathbb{K}}$, where $\mathbb{K}$ is the scalar field.

Proposition 15.65. If $T$ is a bounded operator on $H$, then $T^{\otimes n}$ is a bounded operator on $H^{\otimes n}$ of norm

\[ \|T^{\otimes n}\| = \|T\|^n. \]

Proof By a scaling argument it suffices to show that if $\|T\| = 1$, then $\|T^{\otimes n}\| = 1$. The inequality $\|T^{\otimes n}\|\ge 1$ being obvious from the definition of the inner product on $H^{\otimes n}$, we prove the inequality $\|T^{\otimes n}\|\le 1$.

If the scalar field is real we denote by $H_{\mathbb{C}}$ the complexification of $H$. Endowed with the norm $\|h+ih'\|_{H_{\mathbb{C}}}^2 := \|h\|^2+\|h'\|^2$, this is a complex inner product space. If $T$ is a contraction on $H$, then $T_{\mathbb{C}}(h+ih') := Th+iTh'$ defines a contraction on $H_{\mathbb{C}}$ of the same norm. Noting that $(T_{\mathbb{C}})^{\otimes n} = (T^{\otimes n})_{\mathbb{C}}$, it suffices to prove the proposition in the case of complex scalars.

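If $T$ is unitary, then

\[ \Bigl\|T^{\otimes n}\sum_{j=1}^k c_j\,h_1^{(j)}\otimes\cdots\otimes h_n^{(j)}\Bigr\|^2 = \sum_{i=1}^k\sum_{j=1}^k c_i\overline{c_j}\prod_{m=1}^n (Th_m^{(i)}|Th_m^{(j)}) \]
\[ = \sum_{i=1}^k\sum_{j=1}^k c_i\overline{c_j}\prod_{m=1}^n (h_m^{(i)}|h_m^{(j)}) = \Bigl\|\sum_{j=1}^k c_j\,h_1^{(j)}\otimes\cdots\otimes h_n^{(j)}\Bigr\|^2 \]

and therefore $T^{\otimes n}$ is an isometry on $H^{\otimes n}$. The corresponding result for contractions follows from the fact that every contraction $T$ on $H$ can be represented as a convex combination of four unitaries by Lemma 14.25.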
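
In finite dimensions the identity $\|T^{\otimes n}\| = \|T\|^n$ can be checked directly, since with respect to the product basis the matrix of $T^{\otimes n}$ is an iterated Kronecker product. The sketch below assumes $H = \mathbb{R}^3$ with a randomly chosen matrix; the helper name `tensor_power` is illustrative.

```python
import numpy as np


def tensor_power(T, n):
    """Matrix of the n-fold tensor power of T with respect to the product basis."""
    result = np.eye(1)            # n = 0: the identity on the scalar field
    for _ in range(n):
        result = np.kron(result, T)
    return result


rng = np.random.default_rng(0)
T = rng.standard_normal((3, 3))
for n in range(4):
    # operator (spectral) norm of the tensor power versus the n-th power of the norm
    print(n, np.linalg.norm(tensor_power(T, n), 2), np.linalg.norm(T, 2) ** n)
```

The agreement reflects the fact that the singular values of a Kronecker product are the products of the singular values of its factors.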


Restricting $T^{\otimes n}$ to the symmetric part $\Gamma^n(H)$ of $H^{\otimes n}$, we obtain well-defined contractions $\Gamma^n(T)$ on $\Gamma^n(H)$. By taking direct sums,

\[ \Gamma(T) := \bigoplus_{n\in\mathbb{N}}\Gamma^n(T) \]

defines a contraction on $\bigoplus_{n\in\mathbb{N}}\Gamma^n(H)$.

Definition 15.66 (Symmetric second quantisation). The Hilbert space completion $\Gamma(H)$ of $\bigoplus_{n\in\mathbb{N}}\Gamma^n(H)$ is called the symmetric Fock space over $H$. When $T$ is a contraction on $H$, the contraction $\Gamma(T)$ on $\Gamma(H)$ is called the symmetric second quantisation of $T$.


Antisymmetric second quantisation can be defined similarly but will not be studied here. Because of this, we will omit the adjective ‘symmetric’ from now on and simply talk about second quantisation.

If $S$ and $T$ are contractions on $H$, their second quantisations satisfy

\[ \Gamma(I) = I, \qquad \Gamma(ST) = \Gamma(S)\Gamma(T), \qquad \Gamma(T^*) = (\Gamma(T))^*. \tag{15.38} \]

In what follows we take again $H = \mathbb{K}^d$. If $T$ is a contraction on $\mathbb{K}^d$, via the Wiener–Itô isometry (Theorem 15.64) the operator $\Gamma(T)$ induces a contraction on $L^2(\mathbb{R}^d,\gamma)$ which, by a slight abuse of notation, will be denoted by $\Gamma(T)$ as well. It is easily checked that (15.38) holds again.


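Lemma 15.67. If $T$ is a contraction on $\mathbb{K}^d$, then for all $h\in\mathbb{K}^d$,

\[ \Gamma(T)K_h = K_{Th}. \]

Proof Denoting by $W$ the Wiener–Itô isometry, for all $h\in\mathbb{K}^d$ with $Th\neq 0$ we have

\[ K_{Th} = \sum_{n\in\mathbb{N}}\frac{|Th|^n}{n!}H_n(\phi_{Th}/|Th|) = \sum_{n\in\mathbb{N}}\frac{|Th|^n}{\sqrt{n!}}W((Th/|Th|)^{\otimes n}) \]
\[ = \sum_{n\in\mathbb{N}}\frac{1}{\sqrt{n!}}W((Th)^{\otimes n}) = W\Bigl(\sum_{n\in\mathbb{N}}\frac{1}{\sqrt{n!}}\Gamma^n(T)h^{\otimes n}\Bigr) \]
\[ = (W\circ\Gamma(T))\Bigl(\sum_{n\in\mathbb{N}}\frac{1}{\sqrt{n!}}h^{\otimes n}\Bigr) = \Gamma(T)\Bigl(W\Bigl(\sum_{n\in\mathbb{N}}\frac{1}{\sqrt{n!}}h^{\otimes n}\Bigr)\Bigr) \]
\[ = \Gamma(T)\sum_{n\in\mathbb{N}}\frac{|h|^n}{n!}H_n(\phi_h/|h|) = \Gamma(T)K_h, \]

where the first identity follows from (15.29) and the second and penultimate steps follow from (15.37). If $Th = 0$, then $K_{Th} = K_0 = 1 = \Gamma(T)K_0$.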


As a special case, for the Ornstein–Uhlenbeck semigroup we obtain:


Theorem 15.68 (Ornstein–Uhlenbeck semigroup and second quantisation). Under the Wiener–Itô isometry, for all $t\ge 0$ we have

\[ OU(t) = \Gamma(e^{-t}I). \]

Proof This follows from Theorem 15.58 and Lemma 15.67, which give

\[ OU(t)K_h = K_{e^{-t}h} = \Gamma(e^{-t}I)K_h, \]

and the density of the span of the functions $K_h$ in $L^2(\mathbb{R}^d,\gamma)$ shown in Lemma 15.56.


Over the real scalars we have the following positivity result.


Theorem 15.69 (Positivity). If $T$ is a contraction on $\mathbb{R}^d$, then $\Gamma(T)$ is a positivity preserving contraction on $L^2(\mathbb{R}^d,\gamma)$.


Proof By Lemma 15.67, for all $h\in\mathbb{R}^d$ we have $\Gamma(T)K_h = K_{Th}\ge 0$. Moreover, for all $c\in\mathbb{R}$,

\[ \Gamma(T)(\exp(c\phi_h)) = \Gamma(T)(\exp(\phi_{ch})) = \exp(\tfrac12c^2|h|^2)\,\Gamma(T)K_{ch} = \exp(\tfrac12c^2|h|^2)\,K_{cTh} = \exp\bigl(c\phi_{Th}+\tfrac12c^2(|h|^2-|Th|^2)\bigr). \]

By analytic continuation this identity extends to arbitrary $c\in\mathbb{C}$.


Let 0< f € #?(R¢) = {fe LRA NL (R4): fe L'(R4)NL?(R4)}. By Fourier 
inversion, 


1 rs 
PO = Beara fy FE1---- Sn) (T exp (ids) a6 
: f( 1
= ay? yy AErr---s5ndexr (irs = sig? Irél?)) dé. 


In view of Or (x) = (x|T§) = (T*x|g ), by dominated convergence we have I'(T) f(-) = 
lime oF ¢(T*-), where 


Fe(E):=Fle)ge(E) with ge(E) = exp(—5 (Ig? - 1 —e)ITP)). 


By taking inverse Fourier transforms and applying Lemma 5.20, we conclude that 
(22)4/ °F. = f *&,. If we can prove that the %, is nonnegative almost everywhere on 
R¢ it follows that I(T) f > 0 almost everywhere on R¢ 

Since $T$ is a contraction we may write

\[ |\xi|^2-(1-\varepsilon)|T\xi|^2 = ((I-(1-\varepsilon)T^*T)\xi|\xi) = |D_\varepsilon\xi|^2, \]

where $D_\varepsilon := (I-(1-\varepsilon)T^*T)^{1/2}$ is invertible. Hence,


<a exp(—5|IDeE|I?) explix-€) a8. 


B(x) = (2m) - 


After a change of variables, the right-hand side can be evaluated as a Fourier transform 
of a Gaussian and is therefore strictly positive on R4@ 


15.6.d The Segal—Plancherel Transform 


In this section we discuss a Gaussian analogue of the Fourier–Plancherel transform $\mathscr{F}$, the so-called Segal–Plancherel transform $\mathscr{W}$ on $L^2(\mathbb{R}^d,\gamma)$. We work over the complex scalars. As before we denote by

\[ dm(x) := \frac{1}{(2\pi)^{d/2}}\,dx \]

the normalised Lebesgue measure. If we reinterpret the Fourier transform as an operator from $L^1(\mathbb{R}^d,m)$ to $L^\infty(\mathbb{R}^d,m)$,

\[ \mathscr{F}f(\xi) = \int_{\mathbb{R}^d}\exp(-ix\cdot\xi)f(x)\,dm(x), \qquad \xi\in\mathbb{R}^d,\ f\in L^1(\mathbb{R}^d,m), \]

its restriction to $L^1(\mathbb{R}^d,m)\cap L^2(\mathbb{R}^d,m)$ extends to an isometry on $L^2(\mathbb{R}^d,m)$. In the present section, the term Fourier–Plancherel transform will refer to this operator.

As in Section 15.5.d we let $U := D\circ E$, where $D : L^2(\mathbb{R}^d,m)\to L^2(\mathbb{R}^d,m)$ and $E : L^2(\mathbb{R}^d,\gamma)\to L^2(\mathbb{R}^d,m)$ are the unitary operators

\[ Df(x) = 2^{d/4}f(\sqrt2\,x), \qquad Ef(x) = e(x)f(x), \]

with $e(x) := \exp(-\tfrac14|x|^2)$.
Theorem 15.70. The mapping $\mathscr{W} : f\mapsto\mathscr{W}f$, defined for multivariate polynomials $f : \mathbb{R}^d\to\mathbb{C}$ and analytic continuation by

\[ \mathscr{W}f(x) := \int_{\mathbb{R}^d}f(-ix+\sqrt2\,y)\,d\gamma(y), \qquad x\in\mathbb{R}^d, \]

extends to a unitary operator on $L^2(\mathbb{R}^d,\gamma)$ and we have

\[ \mathscr{W} = U^{-1}\circ\mathscr{F}\circ U. \tag{15.39} \]


Proof Since $D$, $E$, and $\mathscr{F}$ are unitary, the unitarity of $\mathscr{W}$ will follow from the operator identity (15.39). To prove this identity, we substitute $\eta = \sqrt2\,y$ and $\xi = -ix+\eta$ to obtain


W $0) = ear [,f(-H+ V2s)ex0(—5 bP) & 
1 d 


, 1 
—Giee. n) [Texe(—gnj) an 


d 
= Gay [ ix+R4 FG) Tex(-3(6 + ix;)*) dé 


= amar [Ser iléP six iP) a8. 


To justify («) it suffices, by writing f as a linear combination of monomials and sepa- 
rating variables, to show that for any k € N andx € R, 


Toccw Bex (— GE +0") 8 = f EFexp(—F(E +in)?) a8. 


But this is clear by Cauchy’s integral formula and a limiting argument using the decay 
at infinity. Hence, 


(EoW cE") f(x) = amas *?(~ 4h") 


x [exp (ISP) sG)exp(—FIEP = 518 x gh?) a8 


= Gaya [fe dexw(— 5 -») 06 
= pan is 24? F(2)exp(—i§ -x) dm) 


= (F oD") f(x) = (Do F oD) f(a), 


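where we used that $\mathscr{F}\circ D = D^{-1}\circ\mathscr{F}$.

We have the following representation of $\mathscr{W}$ in terms of second quantisation:

Theorem 15.71 (Segal). $\mathscr{W} = \Gamma(-iI)$.


Proof Let $f : \mathbb{R}^d\to\mathbb{C}$ be a multivariate polynomial. Mehler’s formula and Theorem 15.68 tell us that for all $t > 0$,

\[ \Gamma(e^{-t}I)f(x) = OU(t)f(x) = \int_{\mathbb{R}^d}f(e^{-t}x+\sqrt{1-e^{-2t}}\,y)\,d\gamma(y). \]

By analytic continuation we may replace $t > 0$ by any $z\in\mathbb{C}$ with $\mathrm{Re}\,z > 0$. Now let $z\to\tfrac12\pi i$. Then $e^{-z}\to -i$ and $\sqrt{1-e^{-2z}}\to\sqrt2$, so the left-hand side tends to $\Gamma(-iI)f(x)$ and the right-hand side to $\mathscr{W}f(x)$.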

The preceding two theorems combine to the following result. Recall the definition of the unitary operator $U$ of (15.28).


Corollary 15.72. The Fourier–Plancherel transform is unitarily equivalent to $\Gamma(-iI)$. More precisely, we have

\[ U^{-1}\circ\mathscr{F}\circ U = \Gamma(-iI). \]


Here, $U$ is as in (15.28). This gives a neat “explanation” of the identity $\mathscr{F}^4 = I$: by the multiplicativity of second quantisation it follows from the identity $(-i)^4 = 1$!
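
Since $\Gamma(-iI)$ acts on the $n$th chaos as multiplication by $(-i)^n$, Theorem 15.71 predicts that the Hermite polynomials are eigenfunctions of the Segal–Plancherel transform. A minimal numerical sketch, assuming $d = 1$ and evaluating $\mathscr{W}f(x) = \int f(-ix+\sqrt2\,y)\,d\gamma(y)$ by Gauss–Hermite quadrature at complex arguments (the function name `segal_plancherel` is illustrative):

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def segal_plancherel(coeffs, x, deg=40):
    """Approximate (W f)(x) = int f(-i x + sqrt(2) y) dgamma(y) for f = sum_k coeffs[k] H_k."""
    nodes, weights = hermegauss(deg)
    weights = weights / math.sqrt(2.0 * math.pi)   # standard Gaussian probability weights
    z = -1j * x + math.sqrt(2.0) * nodes           # complex evaluation points
    return complex(np.sum(weights * hermeval(z, coeffs)))


x = 0.8
for n in range(6):
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                                # f = H_n
    lhs = segal_plancherel(coeffs, x)
    rhs = (-1j) ** n * hermeval(x, coeffs)         # expected: (-i)^n H_n(x)
    print(n, lhs, rhs)
```

Applying the transform four times multiplies $H_n$ by $(-i)^{4n} = 1$, which is the one-dimensional shadow of the identity $\mathscr{F}^4 = I$ mentioned above.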


15.6.e Creation and Annihilation 
For h € K@ and n € N the (bosonic) creation operator a\(h) :T"(K4¢) > "+! (K¢) is 
defined a 
h) Y hgi1) +++ @he(n) 


he 
n+l 


— hoi 
= Dd dha 


and the (bosonic) annihilation operator dn41(h) :T"*!(IK¢) + I" (K¢) by ao(h) := 0 
and 


1) 8°: ‘Oh, o(j- 1) @h@h,e () O° @heny, 


anzi(h) YY hey @--@how+t 


OESn +41 


1 n+l 


= Jel py X (hots (|Z) he 1) @° -@he; o(j) 877 @he(n41) 
OSn+1 J 


using the notation ~ to express that this term is omitted. These operators are well defined 
and bounded, and their operator norms are bounded by 


Ilan (A) Il arm cca rt (cay) = [ang (A) Il. crm cay rm cxayy S Cal | 
with constants C,, depending on n only. The first equality follows from the duality
ait (h) = ans (h). 
Furthermore, a straightforward computation gives the commutation relations 
an(h)ay., (th) — a} (h)an 1 (h) = |A/?T. 


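Let $(e_j)_{j=1}^d$ denote the standard basis of $\mathbb{K}^d$. Using the Wiener–Itô isometry we may define operators $A$ and $A^\dagger$ as densely defined operators from $L^2(\mathbb{R}^d,\gamma)$ to $L^2(\mathbb{R}^d,\gamma;\mathbb{K}^d)$ and from $L^2(\mathbb{R}^d,\gamma;\mathbb{K}^d)$ to $L^2(\mathbb{R}^d,\gamma)$, respectively, by putting

\[ AW(x) := (W(a_n(e_1)x),\dots,W(a_n(e_d)x)), \qquad x\in\Gamma^n(\mathbb{K}^d),\ n\in\mathbb{N}, \]

and

\[ A^\dagger(W(x_1),\dots,W(x_d)) := W\bigl(a_n^\dagger(e_1)x_1+\cdots+a_n^\dagger(e_d)x_d\bigr), \qquad x_1,\dots,x_d\in\Gamma^n(\mathbb{K}^d),\ n\in\mathbb{N}. \]

The operators $A$ and $A^\dagger$ are dual to each other in the sense that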
(Afls) = (f1A"s), SE Ma, 8 EM (15.40) 


The identity (15.40) easily implies that $A$ and $A^\dagger$ are closable. From now on we denote by $A$ and $A^\dagger$ their closures and by $D(A)$ and $D(A^\dagger)$ the domains of their closures. Let $\nabla = (\partial_1,\dots,\partial_d)$ be the gradient, viewed as a densely defined closed operator from $L^2(\mathbb{R}^d,\gamma)$ to $L^2(\mathbb{R}^d,\gamma;\mathbb{K}^d)$ with its natural domain $D(\nabla) = H^1(\mathbb{R}^d,\gamma)$, the Hilbert space of all functions in $L^2(\mathbb{R}^d,\gamma)$ admitting a weak derivative belonging to $L^2(\mathbb{R}^d,\gamma)$.


Lemma 15.73. The space Pol(R?) of polynomials in the real variables x,,...,Xq is 
dense H'(R4,y). 


Proof We sketch the main line of argument and leave some tedious details to the reader 
(cf. Problem 15.16). As we have seen in Section 13.6.e, the densely defined closed 
operator associated with the sesquilinear form 


coulf.8) = |. VF-Vedr), f.8€ D(aou), 


with D(agy) = H'(R4,7), equals —L, where L is the generator of the Ornstein—Uhlen- 
beck semigroup OU on L7(R4,7). We claim that D(L) is dense in D(agy) = H'(R4,). 
This is a special case of a general density result mentioned in the Notes to Chapter 13 but 
can be proved directly as follows. Since the Ornstein—Uhlenbeck semigroup is analytic 
(Theorem 13.57), for all f € L7(R4, 7) andt > 0 we have S(t) € D(L) by Theorem 13.31. 
In particular this implies S(t) f € D(agy) = H'(R4, 7) and it suffices to prove that 


lim |OU(®)F ~ flay) =0 
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for all f€ H 1 (R4, y). For this, in turn, it suffices to check that for all such f we have 
Be OU im VF lli2ceayKa) = 9. 


Writing g := 0;f, this follows by differentiating under the integral in the definition of 
the Ornstein—Uhlenbeck semigroup: 


a,0U(t =f, aif(e"(-) + V1 —ey) dy(y) 
eat ‘a ile) +V 1 ey) dy(y) = TOU (1) 05 f. 


The definition of the domain of the operator associated with a form, combined with
a straightforward computation, shows that $\mathrm{Pol}(\mathbb{R}^d)$ is contained in $D(L)$. We will show
that $\mathrm{Pol}(\mathbb{R}^d)$ is invariant under the Ornstein–Uhlenbeck semigroup. Once this has been
established, Proposition 13.5 implies that $\mathrm{Pol}(\mathbb{R}^d)$ is dense in $D(L)$.

To prove the invariance of the space $\mathrm{Pol}(\mathbb{R}^d)$ under the Ornstein–Uhlenbeck semigroup,
first let $f$ be a monomial of the form

\[ f(x) = x_1^{k_1}\cdots x_d^{k_d}, \qquad x\in\mathbb{R}^d, \tag{15.41} \]

with $k_j\in\mathbb{N}$ for all $j = 1,\dots,d$. For $\gamma$-almost all $x\in\mathbb{R}^d$ we have

\[ OU(t)f(x) = \int_{\mathbb{R}^d} f\bigl(e^{-t}x+\sqrt{1-e^{-2t}}\,y\bigr)\,d\gamma(y) =: F_t(x). \]

Substituting the expression (15.41), by direct evaluation we see that $F_t\in\mathrm{Pol}(\mathbb{R}^d)$. The
desired invariance follows by taking linear combinations.
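
The "direct evaluation" in the last step can be made concrete in dimension $d = 1$. The following is a minimal sketch, not part of the original argument: it expands $(e^{-t}x+\sqrt{1-e^{-2t}}\,y)^k$ by the binomial theorem and integrates term by term using the standard Gaussian moments. The function names are hypothetical and chosen for illustration only.

```python
import math

def gaussian_moment(m):
    """E[y^m] for a standard Gaussian variable y: 0 for odd m, (m-1)!! for even m."""
    if m % 2 == 1:
        return 0
    result = 1
    for i in range(1, m, 2):
        result *= i
    return result

def ou_monomial_coeffs(k, t):
    """Return c[0..k] such that OU(t)(x^k) = sum_i c[i] * x^i in dimension d = 1."""
    a = math.exp(-t)               # e^{-t}
    b = math.sqrt(1.0 - a * a)     # sqrt(1 - e^{-2t})
    coeffs = [0.0] * (k + 1)
    for m in range(k + 1):
        # binomial term C(k,m) (a x)^(k-m) (b y)^m, integrated in y against gamma
        coeffs[k - m] += math.comb(k, m) * a ** (k - m) * b ** m * gaussian_moment(m)
    return coeffs

# Example: OU(t) x^2 = e^{-2t} x^2 + (1 - e^{-2t})
print(ou_monomial_coeffs(2, 0.5))
```

The output is a list of at most $k+1$ coefficients, so the image of a monomial of degree $k$ is a polynomial of degree at most $k$, in line with the invariance claimed in the proof.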


Proposition 15.74. We have $A = \nabla$ with equality of domains.


Proof First let $f = \frac{1}{\sqrt{n!}}H_n(\phi_h)$ with $n\in\mathbb{N}$ and $|h| = 1$. This is the image under the
Wiener–Itô isometry $W$ of the element $h^{\otimes n}\in\Gamma^n(\mathbb{C}^d)$ and we have

\[ (Af)_j = (AW(h^{\otimes n}))_j = \sqrt{n}\,(h|e_j)\,W(\underbrace{h\otimes\cdots\otimes h}_{n-1\ \text{times}}) = \frac{\sqrt{n}\,(h|e_j)}{\sqrt{(n-1)!}}\,H_{n-1}(\phi_h). \]

On the other hand, since $H_n' = nH_{n-1}$ and $\partial_j\phi_h = (h|e_j)$,

\[ (\nabla f)_j = \frac{1}{\sqrt{n!}}\,\partial_j H_n(\phi_h) = \frac{n}{\sqrt{n!}}\,(h|e_j)\,H_{n-1}(\phi_h), \]

and the two right-hand sides agree because $n/\sqrt{n!} = \sqrt{n}/\sqrt{(n-1)!}$.
This proves that $Af = \nabla f$ for all functions $f$ of the above form and all $n\in\mathbb{N}$. Since the linear span
of these functions is dense in $D(A)$ by definition, and since $\nabla$ is closed, this gives the
inclusion $A\subseteq\nabla$.

To prove equality $A = \nabla$ it remains to be shown that the linear span of these functions
is also dense in $H^1(\mathbb{R}^d,\gamma)$. For this purpose we recall from the proof of
Theorem 15.60, in the special case of the standard unit basis of $\mathbb{R}^d$, that for each $n\in\mathbb{N}$
the linear span of the polynomials $H_n(\phi_h)$, $|h| = 1$, equals the space of all polynomials
of the form $x\mapsto H_{n_1}(x_1)\cdots H_{n_d}(x_d)$ with $n_1+\cdots+n_d = n$. Their linear span when
$n$ ranges over $\mathbb{N}$ equals the space $\mathrm{Pol}(\mathbb{R}^d)$ introduced above. This space is dense in
$H^1(\mathbb{R}^d,\gamma)$ by Lemma 15.73.


In what follows we work over the complex scalars. For $j = 1,\dots,d$ and $f\in\mathrm{Pol}(\mathbb{R}^d)$
we let $a_jf := (Af)_j$ and $a_j^\star f := A^\star(0,\dots,0,f,0,\dots)$ with $f$ in the $j$th place. By (15.40)
these operators are dual to each other, in the sense that with respect to the inner product
of $L^2(\mathbb{R}^d,\gamma)$ we have

\[ (a_jf|g) = (f|a_j^\star g), \qquad f,g\in\mathrm{Pol}(\mathbb{R}^d). \]

Define the position operator $Q = (q_1,\dots,q_d)$ by

\[ q_j := \frac{1}{\sqrt2}(a_j+a_j^\star). \]

The choice of the normalising constant $1/\sqrt2$ in the definition of $q_j$ may appear unnatural.
The reason for this choice will become apparent in (15.42), (15.43), and (15.46).
Viewed as an operator in $L^2(\mathbb{R}^d,\gamma)$ with dense initial domain $\mathrm{Pol}(\mathbb{R}^d)$, this operator
is symmetric and therefore closable. We claim that its closure, which we denote by $q_j$
again, is selfadjoint. First we claim that, for almost all $x\in\mathbb{R}^d$,

\[ q_jf(x) = \frac{1}{\sqrt2}\,x_jf(x), \qquad f\in\mathrm{Pol}(\mathbb{R}^d). \]

Indeed, since by Proposition 15.74 we have $a_j = \partial_j$, the directional derivative in the
direction of $e_j$, it follows that

\[ \sqrt2\,(q_jf|g) = (f|\partial_jg) + (\partial_jf|g). \]

Suppose now that $f,g\in\mathrm{Pol}(\mathbb{R}^d)$. Then, with $x = (x_1,\dots,x_d)$,

\[ (f|\partial_jg) = \int_{\mathbb{R}^d} f\,\overline{\partial_jg}\,d\gamma
 = \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} f(x)\,\overline{\partial_jg(x)}\,\exp\bigl(-\tfrac12|x|^2\bigr)\,dx \]
\[ = -\frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^d} \bigl[\partial_jf(x) - x_jf(x)\bigr]\,\overline{g(x)}\,\exp\bigl(-\tfrac12|x|^2\bigr)\,dx
 = -(\partial_jf|g) + (x_jf|g). \]

It follows that $(\partial_j+\partial_j^\star)f(x) = x_jf(x)$. This proves the claim. The asserted selfadjointness
of $q_j$ is an easy consequence of this claim.
In a similar way we define the momentum operator $P = (p_1,\dots,p_d)$ by

\[ p_j := \frac{1}{i\sqrt2}(a_j-a_j^\star). \]

Again this operator, initially defined on functions in $\mathrm{Pol}(\mathbb{R}^d)$, is symmetric and hence
closable, and its closure is selfadjoint.

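The identities below are understood in the sense that they hold when applied to functions
in $\mathrm{Pol}(\mathbb{R}^d)$. Some additional details are addressed in Problems 15.17–15.20. From
the commutation relation $[a_j,a_j^\star] = I$ we have

\[ [p_j,q_j] = \frac{1}{2i}\bigl((a_j-a_j^\star)(a_j+a_j^\star) - (a_j+a_j^\star)(a_j-a_j^\star)\bigr)
 = \frac{1}{i}\,(a_ja_j^\star - a_j^\star a_j) = -iI \tag{15.42} \]

as well as the identity

\[ \tfrac12(p_j^2+q_j^2) = -\tfrac14(a_j-a_j^\star)^2 + \tfrac14(a_j+a_j^\star)^2
 = \tfrac12(a_ja_j^\star + a_j^\star a_j) = a_j^\star a_j + \tfrac12[a_j,a_j^\star] = a_j^\star a_j + \tfrac12 I. \tag{15.43} \]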


As is checked by an easy computation, in terms of the annihilation and creation
operators, the Ornstein–Uhlenbeck operator is given by

\[ -L = \nabla^\star\nabla = A^\star A = \sum_{j=1}^d a_j^\star a_j, \tag{15.44} \]

so that by (15.43),

\[ -L + \frac{d}{2} = \frac12(P^2+Q^2) = \frac12\sum_{j=1}^d (p_j^2+q_j^2), \tag{15.45} \]

again in the sense that these identities hold when the operators are applied to functions
in $\mathrm{Pol}(\mathbb{R}^d)$. The operators $P$ and $Q$ intertwine with the momentum operator $D = \frac1i\nabla$ and
the position operator $X$, in the sense that

\[ U\circ q_j\circ U^{-1} = x_j, \qquad U\circ p_j\circ U^{-1} = \frac1i\,\partial_j, \tag{15.46} \]

with $U$ the unitary operator of Sections 15.5.d and 15.6.d. These relations are easy to
check by explicit computation and justify the terminology ‘position’ and ‘momentum’
for $q_j$ and $p_j$. In this way we recover the unitary equivalence, established in Theorem
13.58, of $-L+\frac{d}{2}$ with the quantum harmonic oscillator.
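
The identities (15.42) and (15.43) can be checked numerically in dimension $d = 1$. The sketch below is an illustration under stated assumptions, not part of the text: it represents $a$ and $a^\star$ by their matrices with respect to the normalised Hermite polynomials $h_n = H_n/\sqrt{n!}$, on which $a h_n = \sqrt{n}\,h_{n-1}$, and truncates to the first $N$ basis vectors. The truncation level `N` and all helper names are hypothetical; because of the truncation the identities can only hold on the leading $(N-1)\times(N-1)$ block.

```python
import math

N = 8  # truncation level; a hypothetical choice for this sketch

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(alpha, A, beta, B):
    """Entrywise alpha*A + beta*B."""
    return [[alpha * x + beta * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# Annihilation matrix: a h_n = sqrt(n) h_{n-1}; the creation matrix is its transpose.
a = [[0.0] * N for _ in range(N)]
for n in range(1, N):
    a[n - 1][n] = math.sqrt(n)
adag = [list(col) for col in zip(*a)]

# q = (a + a*)/sqrt(2),  p = (a - a*)/(i sqrt(2))
q = lincomb(1 / math.sqrt(2), a, 1 / math.sqrt(2), adag)
p = lincomb(1 / (1j * math.sqrt(2)), a, -1 / (1j * math.sqrt(2)), adag)

commutator = lincomb(1, matmul(p, q), -1, matmul(q, p))      # [p, q]
oscillator = lincomb(0.5, matmul(p, p), 0.5, matmul(q, q))   # (p^2 + q^2)/2
identity = [[float(i == j) for j in range(N)] for i in range(N)]
number_plus_half = lincomb(1, matmul(adag, a), 0.5, identity)  # a* a + I/2

# Diagonal of (p^2 + q^2)/2 away from the truncation edge: n + 1/2
print([round(oscillator[n][n].real, 10) for n in range(N - 1)])
```

On the leading block the commutator is $-iI$ and $\tfrac12(p^2+q^2)$ is diagonal with entries $n+\tfrac12$, the familiar harmonic oscillator spectrum.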


Problems

15.1 Prove the assertions about orthogonal projections in Section 15.1.c.

15.2 Prove that if $P$ and $Q$ are orthogonal projections on a Hilbert space such that $R(P)$
is contained in $R(Q)$, then $Q = P\vee(Q\wedge\neg P)$.

15.3 Let $\phi:\mathscr{L}(H)\to\mathbb{C}$ be a state, let $(h_n)_{n\ge1}$ be an orthonormal basis for $H$, and let
$P_n$ denote the orthogonal projection onto the span of the set $\{h_1,\dots,h_n\}$. Prove
that for all $T\in\mathscr{L}(H)$ we have

\[ \phi(T) = \lim_{n\to\infty}\phi(P_nT). \]

Hint: Apply the Cauchy–Schwarz inequality to the mapping $(T,U)\mapsto\phi(TU^\star)$
and take $U := I-P_n$.

15.4 Consider a qubit in state $\alpha|0\rangle+\beta|1\rangle$, where $\alpha,\beta\in\mathbb{C}$ satisfy $|\alpha|^2+|\beta|^2 = 1$.
Compute the probabilities that upon measuring the spin in direction $j\in\{1,2,3\}$
we find ‘up’, respectively ‘down’.

15.5 We take a closer look at the Pauli matrices $\sigma_1,\sigma_2,\sigma_3$.

(a) Show that the complex exponentials of the Pauli matrices are given by
\[ \exp(i\theta\sigma_j) = (\cos\theta)I + i(\sin\theta)\sigma_j, \qquad j = 1,2,3. \]
(b) Show that if $v$ is a unit vector in $\mathbb{R}^3$, then for all $n\in\mathbb{N}$ we have
\[ (v\cdot\sigma)^n = \begin{cases} I, & n \text{ even},\\ v\cdot\sigma, & n \text{ odd}.\end{cases} \]
Use this to prove the identity
\[ \exp(i\theta v\cdot\sigma) = (\cos\theta)I + i(\sin\theta)\,v\cdot\sigma. \]
Furthermore show that $\det(\exp(i\theta v\cdot\sigma)) = 1$.
(c) Conclude that
\[ \{\exp(i\theta v\cdot\sigma) : \theta\in[0,2\pi]\} = SU(2), \]
the group of unitary matrices acting on $\mathbb{C}^2$ with determinant 1.

15.6 Prove that if $U$ is a symmetry of $H$, then the mapping $\tau_U:\mathscr{P}(H)\to\mathscr{P}(H)$ given
by $\tau_U(P) := U^\star PU$ enjoys the following properties:
@ what, 
(ii) for all $P\in\mathscr{P}(H)$ we have $\tau_U(\neg P) = \neg\tau_U(P)$;
(iii) for all $P_1,P_2\in\mathscr{P}(H)$ we have
\[ \tau_U(P_1\wedge P_2) = \tau_U(P_1)\wedge\tau_U(P_2), \qquad \tau_U(P_1\vee P_2) = \tau_U(P_1)\vee\tau_U(P_2). \]
15.7 Show that position and momentum are covariant with respect to rotations $R_\rho$ on
$L^2(\mathbb{R}^d,m)$ given by $R_\rho f(x) = f(\rho^{-1}x)$, where $\rho\in SO(d)$, the group of orthogonal
transformations on $\mathbb{R}^d$ with determinant 1.

15.8 For $G = \mathbb{Z}/2\mathbb{Z}$, determine the position and momentum operators on $L^2(G)\simeq\mathbb{C}^2$.

15.9 Find the projection-valued measures associated with the selfadjoint operators x; 
and €; discussed in Section 15.5.b. 

15.10 In this problem we prove Wintner’s theorem: There exists no pair of bounded operators
$S,T\in\mathscr{L}(H)$ satisfying the Heisenberg commutation relation $ST-TS = iI$.
We may absorb the imaginary constant $i$ into one of the two operators and
consider the identity $ST-TS = I$ instead. Assuming that $S,T\in\mathscr{L}(H)$ satisfy
$ST-TS = I$, obtain a contradiction by completing the following steps.

(a) Show that for all $n = 1,2,\dots$ we have $S^nT-TS^n = nS^{n-1}$.
(b) Deduce that $S^{n-1}\ne0$ and $n\|S^{n-1}\|\le2\|S^{n-1}\|\,\|S\|\,\|T\|$.

15.11 The aim of this problem is to prove that for a linear mapping $\phi:\mathscr{L}(H)\to\mathbb{C}$ the
following assertions are equivalent:
(1) $\phi(T) = \sum_{j=1}^k (Tx_j|y_j)$ for suitable $k\ge1$ and $x_1,\dots,x_k,y_1,\dots,y_k\in H$;
(2) $\phi$ is continuous with respect to the weak topology of $\mathscr{L}(H)$;
(3) $\phi$ is continuous with respect to the strong topology of $\mathscr{L}(H)$.
(a) Prove the implications (1)⇒(2)⇒(3).
The remainder of the problem is devoted to a proof of the implication (3)⇒(1).
(b) Show that (3) implies that there exist $x_1,\dots,x_k\in H$ such that
\[ |\phi(T)| \le \max_{1\le j\le k}\|Tx_j\|, \qquad T\in\mathscr{L}(H). \]
(c) Let $K$ be the closure of the subspace $\{(Tx_1,\dots,Tx_k)\in H^k : T\in\mathscr{L}(H)\}$ in
$H^k$. Show that the linear mapping $\psi:K\to\mathbb{C}$ defined by
\[ \psi(Tx_1,\dots,Tx_k) = \phi(T) \]
is well defined and bounded.
(d) Using the Riesz representation theorem, show that $\phi$ is of the form as in (1).


15.12 Prove that for a linear mapping $\phi:\mathscr{L}(H)\to\mathbb{C}$ the following assertions are
equivalent:

(1) there exist sequences $(x_n)_{n\ge1}$ and $(y_n)_{n\ge1}$ satisfying
\[ \sum_{n\ge1}\|x_n\|^2 < \infty \quad\text{and}\quad \sum_{n\ge1}\|y_n\|^2 < \infty \]
such that for all $T\in\mathscr{L}(H)$ we have
\[ \phi(T) = \sum_{n\ge1}(Tx_n|y_n); \]
(2) $\phi$ is continuous on $\overline{B}_{\mathscr{L}(H)}$ with respect to the weak topology of $\mathscr{L}(H)$;
(3) $\phi$ is continuous on $\overline{B}_{\mathscr{L}(H)}$ with respect to the strong topology of $\mathscr{L}(H)$;
(4) $\phi$ is normal.

If $\phi$ is positive and satisfies $\phi(I) = 1$, these conditions are equivalent to:

(5) there exists an orthogonal sequence $(x_n)_{n\ge1}$ satisfying $\sum_{n\ge1}\|x_n\|^2 = 1$ such
that for all $T\in\mathscr{L}(H)$ we have
\[ \phi(T) = \sum_{n\ge1}(Tx_n|x_n). \]


15.13 Using the functional calculus for projection-valued measures on T we may define 
6:= ik arg(z) d@(z), 


where the projection-valued measure © : A(T) + A(L?(T)) is the angle ob- 
servable of Section 15.5.c, There is some ambiguity here as to how to take the 
argument; for the sake of definiteness we take it in (—7, 7]. 


(a) Show that 6 is bounded and selfadjoint on L?(T). 
(b) Show that for all f,g € L?(T) we have 


(@f\g) = x/. 0 f(e®)g(e®) do 


Define the angular momentum operator as the selfadjoint operator 7 defined by 
the angular momentum observable L: 4(Z) + A(L*(T)) of Section 15.5.c, 


i= ys nbn}. 


(c) Show that, with an appropriate choice of domain, Tis selfadjoint on L?(T). 
(d) Prove that 6 and / satisfy the Heisenberg commutation relation 
16-@1=i1 
on D(76) n D(67) and show that this domain is dense in L?(T). 
The operator ) appears to be of little use in Physics. This is related to the failure 
of the ‘continuous variable’ Weyl commutation relation for 0 and /: 
(e) Show that there exists no bounded operator T on L?(T) such that the follow- 
ing identity holds for all s,t € R: 


ebT pitl pee eiSt pitl oisT (15.47) 


Show that the same conclusion holds if we assume that T is a (possibly un- 
bounded) selfadjoint operator. 

Hint: Show that if an s € R exists such that the identity in (15.47) holds for 
all t € R, then s € Z. 
(f) Prove a similar result for the phase operator of Section 15.3.d.

15.14 Show that if $T$ is a contraction on $\mathbb{R}^d$, then for every $1 < p < \infty$ the second
quantised operator $\Gamma(T)$ extends to a contraction on $L^p(\mathbb{R}^d,\gamma)$.

15.15 Show that if $U$ is unitary on $\mathbb{C}^d$, then for all $f\in L^2(\mathbb{R}^d,\gamma)$ we have

\[ \Gamma(U)f(x) = f(U^\star x) \]

for almost all $x\in\mathbb{R}^d$.

15.16 Complete the details of the proof of Lemma 15.73.

15.17 Complete the details of the proofs that the position and momentum operators $q_j$
and $p_j$ are selfadjoint on $L^2(\mathbb{R}^d,\gamma)$.

15.18 Prove the commutation relation $[a_j,a_j^\star] = I$ used in the proof of (15.42). Also
prove that if $j\ne k$, then $[a_j,a_k^\star] = 0$ and $[p_j,q_k] = 0$.

15.19 Show that the operator $-\frac{d}{2}+\frac12(P^2+Q^2)$, considered in (15.46) as a densely
defined operator in $L^2(\mathbb{R}^d,\gamma)$ with domain $\mathrm{Pol}(\mathbb{R}^d)$, is closable and its closure
equals $-L$.

15.20 Show that the position and momentum operators $q_j$ and $p_j$ introduced in Section
15.6.e satisfy the relations

\[ q_j\circ\mathscr{W} = \mathscr{W}\circ p_j, \qquad p_j\circ\mathscr{W} = -\mathscr{W}\circ q_j, \]

consistent (modulo the difference in normalisations of the Fourier transform) with
the relations $x_j\circ\mathscr{F} = \mathscr{F}\circ(\frac1i\partial_j)$ and $(\frac1i\partial_j)\circ\mathscr{F} = -\mathscr{F}\circ x_j$ for position and
momentum operators of Section 15.5.b.


Appendix A 


Zorn’s Lemma 


Zorn’s lemma provides a sufficient condition for the existence of maximal elements in 
partially ordered sets. Its formulation uses some terminology which we introduce first. 
A relation on a set S is a subset R of the cartesian product S x S. Instead of (x,y) € R 
we often write xRy. 


Definition A.1 (Partially ordered sets). A partially ordered set is a pair $(S,\le)$, where $S$
is a set and $\le$ is a relation on $S$ such that for all $x,y,z\in S$ we have:

(i) (reflexivity): $x\le x$;
(ii) (antisymmetry): if $x\le y$ and $y\le x$, then $x = y$;
(iii) (transitivity): if $x\le y$ and $y\le z$, then $x\le z$.

A totally ordered set is a partially ordered set $(S,\le)$ with the property that for all $x,y\in S$
we have $x\le y$ or $y\le x$ (or both, in which case $x = y$).


Definition A.2 (Maximal elements, upper bounds). Let $(S,\le)$ be a partially ordered
set. An element $x\in S$ is said to be maximal if $x\le y$ implies $y = x$. An element $x\in S$ is
said to be an upper bound for the subset $S'\subseteq S$ if $x'\le x$ holds for all $x'\in S'$.


Assuming the Axiom of Choice, one has the following result.


Theorem A.3 (Zorn’s lemma). If $(S,\le)$ is a nonempty partially ordered set with the
property that each of its totally ordered subsets has an upper bound in $S$, then $S$ has a
maximal element.
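
For a finite partially ordered set the notions of Definitions A.1 and A.2 can be checked by brute force, and the existence of maximal elements needs no appeal to the Axiom of Choice. The sketch below is only an illustration of the definitions on a hypothetical example (the integers $1,\dots,10$ ordered by divisibility); the function names are not from the text.

```python
def is_partial_order(S, leq):
    """Check reflexivity, antisymmetry and transitivity of leq on the finite set S."""
    S = list(S)
    reflexive = all(leq(x, x) for x in S)
    antisymmetric = all(x == y for x in S for y in S if leq(x, y) and leq(y, x))
    transitive = all(leq(x, z) for x in S for y in S for z in S
                     if leq(x, y) and leq(y, z))
    return reflexive and antisymmetric and transitive

def maximal_elements(S, leq):
    """All x in S such that x <= y implies y = x."""
    S = list(S)
    return [x for x in S if all(y == x for y in S if leq(x, y))]

def upper_bounds(S, subset, leq):
    """All x in S with s <= x for every s in the given subset."""
    return [x for x in S if all(leq(s, x) for s in subset)]

S = range(1, 11)
divides = lambda x, y: y % x == 0   # x <= y means "x divides y"

print(is_partial_order(S, divides))
print(maximal_elements(S, divides))
print(upper_bounds(S, [1, 2, 4], divides))
```

The example shows that maximal elements need not be unique and that a totally ordered subset such as $\{1,2,4\}$ may have several upper bounds.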
Appendix B


Tensor Products 


Let $V$ and $W$ be vector spaces and let $\mathscr{B}(V,W)$ denote the vector space of all bilinear
mappings from $V\times W$ into the scalar field $\mathbb{K}$, that is, all mappings $\phi:V\times W\to\mathbb{K}$
satisfying

\[ \phi(cv,w) = \phi(v,cw) = c\,\phi(v,w) \]

for all $c\in\mathbb{K}$, $v\in V$, and $w\in W$, and

\[ \phi(v+v',w) = \phi(v,w)+\phi(v',w), \qquad \phi(v,w+w') = \phi(v,w)+\phi(v,w') \]

for all $v,v'\in V$ and $w,w'\in W$.
For all $v\in V$ and $w\in W$, the mapping

\[ v\otimes w: \phi\mapsto\phi(v,w) \]

defines an element of $\mathscr{B}(V,W)'$, the vector space of all linear mappings from $\mathscr{B}(V,W)$
to $\mathbb{K}$. Note that

\[ c(v\otimes w) = (cv)\otimes w = v\otimes(cw) \]

for all $c\in\mathbb{K}$, $v\in V$, and $w\in W$, and

\[ (v+v')\otimes w = v\otimes w+v'\otimes w, \qquad v\otimes(w+w') = v\otimes w+v\otimes w' \]

for all $v,v'\in V$ and $w,w'\in W$.

Definition B.1 (Algebraic tensor product). The (algebraic) tensor product

\[ V\otimes W \]

of $V$ and $W$ is the linear span in $\mathscr{B}(V,W)'$ of the set $\{v\otimes w : v\in V,\ w\in W\}$.
We have natural isomorphisms of vector spaces

\[ \mathbb{K}\otimes V\simeq V\otimes\mathbb{K}\simeq V. \]

By the above definition and the identities preceding it, every element of $V\otimes W$ admits
a representation as a finite sum $\sum_{j=1}^k v_j\otimes w_j$.

If the (finite or infinite) sets $\{v_i : i\in I\}$ and $\{w_i : i\in I\}$ are both linearly independent,
so is the set $\{v_i\otimes w_i : i\in I\}$. Indeed, suppose that $\sum_{j=1}^k c_jv_{i_j}\otimes w_{i_j} = 0$ for certain $k\ge1$
and scalars $c_1,\dots,c_k$. For $\varphi\in V'$ and $\psi\in W'$ the mapping $\phi:(v,w)\mapsto\varphi(v)\psi(w)$
belongs to $\mathscr{B}(V,W)$ and accordingly

\[ 0 = \Bigl(\sum_{j=1}^k c_jv_{i_j}\otimes w_{i_j}\Bigr)(\phi) = \sum_{j=1}^k c_j\phi(v_{i_j},w_{i_j})
 = \sum_{j=1}^k c_j\varphi(v_{i_j})\psi(w_{i_j}) = \psi\Bigl(\sum_{j=1}^k c_j\varphi(v_{i_j})w_{i_j}\Bigr). \]

This being true for all $\psi\in W'$, the linear independence of $\{w_{i_1},\dots,w_{i_k}\}$ implies that
$\varphi(c_jv_{i_j}) = c_j\varphi(v_{i_j}) = 0$ for all $\varphi\in V'$ and $j = 1,\dots,k$. But this implies that $c_jv_{i_j} = 0$
for all $j = 1,\dots,k$. The linear independence of $\{v_{i_1},\dots,v_{i_k}\}$ implies that $v_{i_j}\ne0$ for all
$j = 1,\dots,k$, so we must have $c_j = 0$ for all $j = 1,\dots,k$. This proves our claim.

As a corollary to this observation we obtain that if $V$ and $W$ are finite-dimensional,
with bases $(v_i)_{i=1}^m$ and $(w_j)_{j=1}^n$, then $(v_i\otimes w_j)_{i,j=1}^{m,n}$ is a basis for $V\otimes W$. In particular,

\[ \dim(V\otimes W) = \dim(V)\dim(W). \]
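
In coordinates the dimension formula can be checked directly. The sketch below rests on an assumption that is not made in the text: it identifies a bilinear form on $\mathbb{Q}^m\times\mathbb{Q}^n$ with its coefficient matrix, so that the functional $v\otimes w$ is determined by the list of products $v_iw_j$ (the Kronecker product). The bases and helper names are hypothetical.

```python
from fractions import Fraction

def kron(v, w):
    """Coordinates (v_i * w_j) of the elementary tensor v (x) w."""
    return [a * b for a in v for b in w]

def evaluate(B, v, w):
    """Value phi(v, w) of the bilinear form with coefficient matrix B."""
    return sum(B[i][j] * v[i] * w[j] for i in range(len(v)) for j in range(len(w)))

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

V_basis = [[1, 1], [1, -1]]                   # a basis of Q^2
W_basis = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]   # a basis of Q^3
products = [kron(v, w) for v in V_basis for w in W_basis]

print(rank(products))   # number of linearly independent elementary tensors
```

The six elementary tensors built from the two bases are linearly independent, which matches $\dim(V\otimes W) = 2\cdot3$.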


For vector spaces $U$, $V$, $W$, the mapping

\[ (u\otimes v)\otimes w\mapsto u\otimes(v\otimes w) \]

uniquely extends to an isomorphism of vector spaces

\[ (U\otimes V)\otimes W\simeq U\otimes(V\otimes W). \]

Stated differently, taking tensor products is associative. This allows us to define the
tensor product

\[ U\otimes V\otimes W \]

as either one of the spaces in this isomorphism; for the sake of concreteness we will
use the space on the left-hand side. With this in mind we can define the tensor product
$V_1\otimes\cdots\otimes V_N$ of vector spaces $V_n$, $n = 1,\dots,N$, inductively by

\[ V_1\otimes\cdots\otimes V_N := (V_1\otimes\cdots\otimes V_{N-1})\otimes V_N. \]

Alternatively one could define $V_1\otimes\cdots\otimes V_N$ in terms of functionals on the space of
$N$-linear mappings; the resulting space is isomorphic in a natural way to the one just
defined. In what follows we write $V^{\otimes n} := V\otimes\cdots\otimes V$ for the $n$-fold tensor product of $V$.
The $n$-fold symmetric tensor product $\Gamma^n(V)$ is defined as the range of the projection
$P_\Gamma$ on $V^{\otimes n}$ given by

\[ P_\Gamma: v_1\otimes\cdots\otimes v_n\mapsto\frac{1}{n!}\sum_{\pi\in S_n} v_{\pi(1)}\otimes\cdots\otimes v_{\pi(n)}, \qquad v_1,\dots,v_n\in V, \]

where $S_n$ is the group of permutations of $\{1,\dots,n\}$. Likewise one defines the $n$-fold
antisymmetric product $\Lambda^n(V)$, also known as the $n$-fold exterior product, of a vector
space $V$ as the range of the projection on $V^{\otimes n}$ given by

\[ P_\Lambda: v_1\otimes\cdots\otimes v_n\mapsto\frac{1}{n!}\sum_{\pi\in S_n}\operatorname{sign}(\pi)\,v_{\pi(1)}\otimes\cdots\otimes v_{\pi(n)}, \qquad v_1,\dots,v_n\in V. \]


In connection with second quantisation, these spaces are sometimes denoted by V®” 
and V®", respectively. 

We conclude with the observation that if S and T are linear operators on the vector 
spaces V and W, respectively, then 


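\[ (S\otimes T): v\otimes w\mapsto Sv\otimes Tw \]

uniquely defines a linear operator $S\otimes T$ on the tensor product $V\otimes W$. To see that this
operator is well defined, suppose that an element in $V\otimes W$ admits two representations

\[ \sum_{n=1}^N c_nv_n\otimes w_n = \sum_{n=1}^{N'} c_n'v_n'\otimes w_n'. \]

If $\phi:V\times W\to\mathbb{K}$ is bilinear, then the mapping $\phi_{S,T}:V\times W\to\mathbb{K}$ given by

\[ \phi_{S,T}(v,w) = \phi(Sv,Tw) \]

is bilinear and

\[ \Bigl(\sum_{n=1}^N c_nSv_n\otimes Tw_n\Bigr)(\phi) = \sum_{n=1}^N c_n\phi(Sv_n,Tw_n)
 = \sum_{n=1}^N c_n\phi_{S,T}(v_n,w_n) = \Bigl(\sum_{n=1}^N c_nv_n\otimes w_n\Bigr)(\phi_{S,T}) \]

and, by the same argument,

\[ \Bigl(\sum_{n=1}^{N'} c_n'Sv_n'\otimes Tw_n'\Bigr)(\phi) = \Bigl(\sum_{n=1}^{N'} c_n'v_n'\otimes w_n'\Bigr)(\phi_{S,T}). \]

It follows that

\[ \Bigl(\sum_{n=1}^N c_nSv_n\otimes Tw_n\Bigr)(\phi) = \Bigl(\sum_{n=1}^{N'} c_n'Sv_n'\otimes Tw_n'\Bigr)(\phi). \]

This being true for all bilinear $\phi:V\times W\to\mathbb{K}$, it follows that

\[ \sum_{n=1}^N c_nSv_n\otimes Tw_n = \sum_{n=1}^{N'} c_n'Sv_n'\otimes Tw_n'. \]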


Appendix C 
Topological Spaces 


This appendix offers a brief treatment of topological spaces. Only those notions are 
covered that find their way into the main text. Several others, such as convergence, will 
only be needed in the more concrete setting of metric spaces and will be discussed in 
that context. 


Definition and General Properties 


Definition C.1 (Topological spaces). A topological space is a pair $(X,\tau)$, where $X$ is a
set and $\tau$ is a topology on $X$, that is, $\tau$ is a collection of subsets of $X$ with the following
properties:

(i) $\varnothing\in\tau$ and $X\in\tau$;
(ii) $\tau$ is closed under taking arbitrary unions;
(iii) $\tau$ is closed under taking finite intersections.

A subset $U$ of $X$ is said to be open if $U\in\tau$, and closed if its complement is open. The
interior $S^\circ$ of a subset $S$ is the union of all open subsets $U$ in $X$ contained in $S$. The
closure $\overline{S}$ of a subset $S$ is the intersection of all closed subsets $F$ of $X$ containing $S$. Note
that $S^\circ$ is the largest open subset of $X$ contained in $S$ and $\overline{S}$ is the smallest closed subset
of $X$ containing $S$. A set $S$ is dense in a closed set $S'$ if $\overline{S} = S'$.

If $\mathscr{C}$ is a collection of subsets of $X$, the topology generated by $\mathscr{C}$ is the intersection
of all topologies on $X$ containing $\mathscr{C}$. The topology generated by a collection $\mathscr{C}$ is the
smallest topology containing every element of $\mathscr{C}$.
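
On a finite set these notions can be computed by brute force. The following sketch is an illustration only, under the assumption that $X$ is finite: in that case closure under arbitrary unions reduces to closure under pairwise unions, and the generated topology is obtained by repeatedly adding unions and intersections. The function names are hypothetical.

```python
def is_topology(X, tau):
    """Check axioms (i)-(iii) of Definition C.1 for a finite set X."""
    X = frozenset(X)
    tau = {frozenset(U) for U in tau}
    if frozenset() not in tau or X not in tau:
        return False
    # For finite tau, pairwise closure implies closure under all unions/intersections.
    return all((U | V) in tau and (U & V) in tau for U in tau for V in tau)

def generated_topology(X, C):
    """Smallest topology on the finite set X containing every member of C."""
    X = frozenset(X)
    tau = {frozenset(), X} | {frozenset(U) for U in C}
    changed = True
    while changed:
        changed = False
        for U in list(tau):
            for V in list(tau):
                for W in (U | V, U & V):
                    if W not in tau:
                        tau.add(W)
                        changed = True
    return tau

def closure(X, tau, S):
    """Intersection of all closed sets (complements of open sets) containing S."""
    X, S = frozenset(X), frozenset(S)
    result = X
    for U in tau:
        F = X - U
        if S <= F:
            result &= F
    return result

X = {1, 2, 3}
tau = generated_topology(X, [{1, 2}, {2, 3}])
print(sorted(sorted(U) for U in tau))
print(sorted(closure(X, tau, {2})))   # {2} is dense in X for this topology
```

In this example the generated topology consists of $\varnothing$, $\{2\}$, $\{1,2\}$, $\{2,3\}$ and $X$, and the singleton $\{2\}$ is dense although it is not closed.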


In what follows we often omit the topology $\tau$ from our notation and write $X$ instead
of the more cumbersome $(X,\tau)$ to denote topological spaces, except in those situations
where confusion could arise. In such situations we may speak of $\tau$-open and $\tau$-closed
sets instead of open and closed sets in order to emphasise the role of $\tau$.

A topological space $X$ is said to be Hausdorff if for every two distinct points $x_1,x_2\in X$
there exist disjoint open sets $U_1,U_2\in\tau$ such that $x_1\in U_1$ and $x_2\in U_2$.


Proposition C.2. Finite subsets of a Hausdorff topological space are closed. 


Proof Since finite unions of closed sets are closed it suffices to prove that every singleton
$\{x\}$ in a Hausdorff space $X$ is closed. For any $y\in X\setminus\{x\}$ choose an open set $U_y$
such that $y\in U_y$ and $x\notin U_y$. This is possible by the Hausdorff assumption (and actually
uses less than that). We then have $\complement\{x\} = \bigcup_{y\in X\setminus\{x\}}U_y$, and this set is open since $\tau$ is
closed under taking arbitrary unions. It follows that $\{x\}$ is closed.


An important class of Hausdorff topological spaces is the class of metric spaces; they
are discussed in more detail in Appendix D. Further examples relevant to Functional Analysis
are Banach spaces with their weak topology, dual Banach spaces with their weak*
topology, and spaces of bounded operators acting between Banach spaces with their
strong and weak operator topologies. For their definitions we refer to the main text.


Continuity 


Let $(X,\tau_X)$ and $(Y,\tau_Y)$ be topological spaces and consider a mapping $f:X\to Y$. We
call $f$ continuous at the point $x_0\in X$ if for every open set $V\in\tau_Y$ containing $f(x_0)$ there
exists an open set $U\in\tau_X$ containing $x_0$ such that $f(U)\subseteq V$. We call $f$ continuous if $f$ is
continuous at every point of $X$. As an immediate consequence of the definition we note
that if $(X,\tau_X)$, $(Y,\tau_Y)$, $(Z,\tau_Z)$ are topological spaces and $f:X\to Y$ is continuous at the
point $x_0\in X$ and $g:Y\to Z$ is continuous at the point $f(x_0)\in Y$, then the composition
$g\circ f:X\to Z$ is continuous at the point $x_0\in X$. In particular, the composition of two
continuous mappings is continuous.


Proposition C.3. Let $(X,\tau_X)$ and $(Y,\tau_Y)$ be topological spaces. For a mapping $f:X\to Y$
the following assertions are equivalent:

(1) $f$ is continuous;
(2) $f^{-1}(V)$ is open for every open subset $V$ of $Y$;
(3) $f^{-1}(F)$ is closed for every closed subset $F$ of $Y$.


Proof (1)⇒(2): Suppose that $f$ is continuous and let $V$ be an open set in $Y$. Let
$x\in f^{-1}(V)$ be arbitrary. Using the definition of continuity we choose an open subset
$U_x\in\tau_X$ containing $x$ such that $f(U_x)\subseteq V$. This means that $U_x\subseteq f^{-1}(V)$. It follows
that $f^{-1}(V) = \bigcup_{x\in f^{-1}(V)}U_x$, and this set is open since $\tau_X$ is closed under taking arbitrary
unions. This shows that $f^{-1}(V)$ is open in $X$.

(2)⇒(3): Suppose $f^{-1}(V)$ is open for every open $V\subseteq Y$. Let $F\subseteq Y$ be closed.
Then its complement $\complement F$ is open in $Y$, hence by our assumption $f^{-1}(\complement F)$ is open. From
$f^{-1}(F) = \complement f^{-1}(\complement F)$ it follows that $f^{-1}(F)$ is closed.

(3)⇒(2): This is proved in the same way, interchanging the roles of ‘open’ and
‘closed’.

(2)⇒(1): Let $x\in X$ be arbitrary and let $V\subseteq Y$ be open and contain $f(x)$. The set
$U = f^{-1}(V)$ is open in $X$ by assumption, $x$ is an element of this set, and we have
$f(U)\subseteq V$. Thus $f$ is continuous at the point $x$.
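
For finite topological spaces the equivalence of Proposition C.3 can be verified exhaustively. The sketch below is an illustration under that finiteness assumption; a mapping is represented by a Python dictionary and a topology by a set of frozensets, and the function names are hypothetical. The example uses the two-point Sierpiński space.

```python
def preimage(f, V):
    """f^{-1}(V) for a mapping f given as a dict from X to Y."""
    return frozenset(x for x in f if f[x] in V)

def is_continuous_at(f, x0, tau_X, tau_Y):
    """Pointwise definition: every open V containing f(x0) admits an open U containing x0 with f(U) in V."""
    return all(
        any(x0 in U and all(f[x] in V for x in U) for U in tau_X)
        for V in tau_Y if f[x0] in V
    )

def preimages_of_open_sets_are_open(f, tau_X, tau_Y):
    """Condition (2) of Proposition C.3."""
    return all(preimage(f, V) in tau_X for V in tau_Y)

def preimages_of_closed_sets_are_closed(f, tau_X, tau_Y):
    """Condition (3) of Proposition C.3."""
    X = frozenset(f)
    Y = frozenset().union(*tau_Y)
    closed_X = {X - U for U in tau_X}
    return all(preimage(f, Y - V) in closed_X for V in tau_Y)

# Sierpinski space: points 0 and 1, open sets: empty set, {1}, {0, 1}.
sierpinski = {frozenset(), frozenset({1}), frozenset({0, 1})}
identity = {0: 0, 1: 1}
swap = {0: 1, 1: 0}

for f in (identity, swap):
    pointwise = all(is_continuous_at(f, x, sierpinski, sierpinski) for x in f)
    print(pointwise,
          preimages_of_open_sets_are_open(f, sierpinski, sierpinski),
          preimages_of_closed_sets_are_closed(f, sierpinski, sierpinski))
```

For each mapping the three printed truth values coincide: the identity satisfies all three conditions, and the swap fails all three.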


Compactness 


Let $X$ be a topological space and let $S$ be a subset of $X$. A collection $\mathscr{U}$ of open subsets of
$X$ is called an open cover of $S$ if $S\subseteq\bigcup_{U\in\mathscr{U}}U$. A subcover is a cover $\mathscr{U}'$ of $S$ contained
in $\mathscr{U}$. The set $S$ is called compact if every open cover of $S$ has a finite subcover. A set is
called relatively compact if its closure is compact.


Proposition C.4. Let X be a topological space. Then: 


(1) every closed subset of X contained in a compact subset of X is compact; 


(2) if X is Hausdorff, then every compact subset of X is closed. 


Proof (1): Let the closed set $F$ be contained in the compact subset $S$ of $X$. Let $\mathscr{U}_0$
be an open cover of $F$, and extend it to an open cover $\mathscr{U}$ of $S$ by adjoining the open
set $\complement F$. The resulting cover of $S$ has a finite subcover, and this subcover also covers $F$.
Removing the set $\complement F$ from this subcover, we are left with a finite subcover of $\mathscr{U}_0$ for $F$.
It follows that $F$ is compact.

(2): Let $S$ be a compact subset of the Hausdorff space $X$. We first claim that for
every $x\in\complement S$ there is an open set $U_x$ containing $x$ and disjoint from $S$. Indeed, for every
$y\in S$, the Hausdorff property provides us with two disjoint open sets $U_{x,y}$ and $V_{x,y}$
such that $x\in U_{x,y}$ and $y\in V_{x,y}$. The open cover $\mathscr{V} = \{V_{x,y} : y\in S\}$ of $S$ has a finite
subcover, say $\mathscr{V}' = \{V_{x,y_1},\dots,V_{x,y_{k_x}}\}$, where $k_x\ge1$ is an integer depending on $x$. The
set $U_x := \bigcap_{j=1}^{k_x}U_{x,y_j}$ is open, contains $x$, and is disjoint from $S$. This proves the claim.
But now we see that $\complement S = \bigcup_{x\in\complement S}U_x$, so $\complement S$ is open and $S$ is closed.


A collection of subsets of a topological space has the finite intersection property if 
every finite subcollection has nonempty intersection. 


Proposition C.5. A nonempty closed subset S of a topological space X is compact if 
and only if every collection of closed subsets of S with the finite intersection property 
has nonempty intersection. 
Proof ‘Only if’: Let $\mathscr{C}$ be a collection of closed subsets of $S$ having the finite intersection
property. If we had $\bigcap_{C\in\mathscr{C}}C = \varnothing$, then $\mathscr{U} := \{\complement C : C\in\mathscr{C}\}$ is an open cover of
$S$ without a finite subcover. For if $\complement C_1,\dots,\complement C_k$ were to cover $S$, then $C_1\cap\cdots\cap C_k = \varnothing$.
This contradicts our assumption. It follows that $\mathscr{C}$ has nonempty intersection.

‘If’: Reasoning by contradiction, assume that every collection of closed subsets of
$S$ with the finite intersection property has nonempty intersection and assume that there
exists an open cover $\mathscr{U}$ of $S$ without finite subcover. Then for any finite choice of sets
$U_1,\dots,U_k\in\mathscr{U}$ we have $S\setminus\bigcup_{j=1}^kU_j\ne\varnothing$. It follows that $\bigcap_{j=1}^k(S\cap\complement U_j)\ne\varnothing$. From the
assumption on $S$ we infer that $\bigcap_{U\in\mathscr{U}}(S\cap\complement U)\ne\varnothing$. But then $\mathscr{U}$ does not cover $S$ and
we have arrived at a contradiction.


Compactness is preserved under continuous mappings: 


Proposition C.6. Let X and Y be topological spaces. Let f : X — Y be a continuous 
mapping. If S is a compact subset of X, then f(S) is compact in Y. 


Proof Let $\mathscr{U}$ be an open cover of $f(S)$. Then $\{f^{-1}(U) : U\in\mathscr{U}\}$ is an open cover of $S$
by Proposition C.3. Since $S$ is compact, it has a finite subcover $\{f^{-1}(U_1),\dots,f^{-1}(U_k)\}$.
The collection $\{U_1,\dots,U_k\}$ is then a finite subcover of $f(S)$.


Every continuous function $f:[a,b]\to\mathbb{R}$ has a global maximum and a global minimum
on $[a,b]$. More generally we have:


Theorem C.7 (Global maxima and minima). Let X be a compact topological space and 
let f :X — R be continuous. Then f attains a global maximum and a global minimum. 


Proof We prove that f attains a global maximum; by applying this to the continuous 
function —f it follows that f also attains a global minimum. 

For $n\ge1$ let $U_n = \{x\in X : f(x) < n\}$. The collection $\mathscr{U} = \{U_n : n\ge1\}$ is an
open cover of $X$ and has, thanks to the compactness of $X$, a finite subcover. From this it
follows that the range of $f$ is bounded above. Let $m := \sup\{f(x) : x\in X\}$.

Suppose that there is no $x\in X$ such that $f(x) = m$; we show that then $X$ cannot be
compact. The assumption just made implies that the collection $\mathscr{V} = \{V_n : n\ge1\}$ is an
open cover of $X$, where $V_n := \{x\in X : f(x) < m-\frac1n\}$. Since for every $n\ge1$ there is an
$x\in X$ such that $f(x) > m-\frac1n$ (this follows from the definition of the supremum), $\mathscr{V}$ has
no finite subcover.


A topological space is called normal if for any two disjoint closed subsets $F$ and $G$
there exist disjoint open subsets $U$ and $V$ such that $F\subseteq U$ and $G\subseteq V$.


Proposition C.8. Every compact Hausdorff space X is normal. 


Proof Let $F$ and $G$ be disjoint nonempty closed subsets of the compact Hausdorff
space $X$. Then $F$ and $G$ are compact by Proposition C.4. Fix a point $x\in F$. Since $X$ is
Hausdorff, for all $y\in G$ there exist disjoint open subsets $U_{x,y}$ and $V_{x,y}$ such that $x\in U_{x,y}$
and $y\in V_{x,y}$. By letting $y$ range over all points of $G$ and using compactness we find an
open cover $V_{x,y_1},\dots,V_{x,y_{k_x}}$ of $G$. Set $U_x := \bigcap_{j=1}^{k_x}U_{x,y_j}$ and $V_x := \bigcup_{j=1}^{k_x}V_{x,y_j}$. Then $x\in U_x$,
$G\subseteq V_x$, and $U_x\cap V_x = \varnothing$. Letting $x$ range over $F$ and using compactness we find an open
cover $U_{x_1},\dots,U_{x_l}$ of $F$. The sets $U := \bigcup_{i=1}^lU_{x_i}$ and $V := \bigcap_{i=1}^lV_{x_i}$ are open and satisfy
$F\subseteq U$, $G\subseteq V$, and $U\cap V = \varnothing$.


In the next appendix we will see that also every metric space is normal. 


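Corollary C.9. Let $X$ be a normal space. If $F\subseteq U\subseteq X$ with $F$ compact and $U$ open,
then there exists an open set $V$ such that

\[ F\subseteq V\subseteq\overline{V}\subseteq U. \]


Proof By normality there exist disjoint open sets $W$ and $W'$ such that $F\subseteq W$ and
$\complement U\subseteq W'$. Then

\[ F\subseteq W\subseteq\complement W'\subseteq U. \]

The set $V := W$ satisfies $F\subseteq V\subseteq\overline{V}\subseteq\complement W'\subseteq U$, where the third inclusion holds since
$\complement W'$ is a closed set containing $V$.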


Urysohn’s Lemma 


In normal spaces, disjoint closed sets can be separated by continuous functions. This is 
the content of the next result. 


Proposition C.10 (Urysohn’s lemma). Let $X$ be a normal space. If $F\subseteq U\subseteq X$ with $F$
closed and $U$ open, then there exists a continuous function $f:X\to[0,1]$ such that $f = 1$
on $F$ and $\operatorname{supp}(f)\subseteq U$.


Proof A rational number $q\in[0,1]$ is called dyadic if it is of the form $k/2^n$ where $k,n\in\mathbb{N}$
and $0\le k\le2^n$. We will construct, for every dyadic $q\in[0,1]$, an open set $U_q$ such that

\[ F\subseteq U_q\subseteq\overline{U_q}\subseteq U \]

and, for dyadic $r\in[0,1]$,

\[ q > r \ \text{ implies } \ \overline{U_q}\subseteq U_r. \]

By Corollary C.9 (applied twice) there exist open sets $U_0$ and $U_1$ such that

\[ F\subseteq U_1\subseteq\overline{U_1}\subseteq U_0\subseteq\overline{U_0}\subseteq U. \]

Reasoning by induction, suppose that for some $n\in\mathbb{N}$ and $k = 0,\dots,2^n$ the open sets
$U_{k/2^n}$ have been chosen such that $q > r$ implies $\overline{U_q}\subseteq U_r$. Using Corollary C.9, for all
$k = 0,\dots,2^n-1$ we find an open set $U_{(2k+1)/2^{n+1}}$ such that

\[ \overline{U_{(k+1)/2^n}}\subseteq U_{(2k+1)/2^{n+1}}\subseteq\overline{U_{(2k+1)/2^{n+1}}}\subseteq U_{k/2^n}. \]

Then $\overline{U_q}\subseteq U_r$ holds for all dyadic $q > r$ of the form $k/2^{n+1}$ with $0\le k\le2^{n+1}$.

Now define

\[ f_q(x) = \begin{cases} q, & \text{if } x\in U_q,\\ 0, & \text{otherwise},\end{cases}
 \qquad g_r(x) = \begin{cases} 1, & \text{if } x\in\overline{U_r},\\ r, & \text{otherwise},\end{cases} \]

and put $f(x) := \sup_qf_q(x)$ and $g(x) := \inf_rg_r(x)$. Then $f$ is lower semicontinuous, $g$ is
upper semicontinuous, $0\le f\le1$, $f = 1$ on $F$, and $\operatorname{supp}(f)\subseteq U$. To conclude the proof
we show that $f = g$.

If $f_q(x) > g_r(x)$, then we must have $q > r$, $x\in U_q$, and $x\notin\overline{U_r}$. But $q > r$ implies
$\overline{U_q}\subseteq U_r$. This contradiction shows that $f_q(x)\le g_r(x)$ for all dyadic $q,r\in[0,1]$ and
$x\in X$. This implies $f\le g$.

If $f(x) < g(x)$, there are dyadic numbers $q,r\in[0,1]$ such that $f(x) < r < q < g(x)$.
But $f(x) < r$ implies that $x\notin U_r$ and $g(x) > q$ implies $x\in\overline{U_q}$. This contradicts the fact
that $q > r$ implies $\overline{U_q}\subseteq U_r$. It follows that $f = g$.
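
The construction can be followed on a concrete example. The sketch below is an illustration with hypothetical choices that are not taken from the text: $X = \mathbb{R}$, $F = [-1,1]$, $U = (-2,2)$, and the nested sets $U_q = \{x : |x| < \tfrac{19}{10}-\tfrac45q\}$, for which $q > r$ indeed implies $\overline{U_q}\subseteq U_r$. It computes the supremum of $f_q(x)$ over the dyadic numbers of a fixed level $n$ and compares it with the closed form of the limit function. Exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def in_U(q, x):
    """x in U_q, where U_q = {x : |x| < 19/10 - (4/5) q} (a hypothetical choice)."""
    return abs(x) < Fraction(19, 10) - Fraction(4, 5) * q

def f_level(x, n):
    """sup of f_q(x) over dyadic q = k/2^n, where f_q = q on U_q and 0 elsewhere."""
    dyadics = (Fraction(k, 2 ** n) for k in range(2 ** n + 1))
    return max((q for q in dyadics if in_U(q, x)), default=Fraction(0))

def f_limit(x):
    """Closed form of sup_q f_q(x) over all dyadic q in [0, 1] for this example."""
    s = (Fraction(19, 10) - abs(x)) / Fraction(4, 5)
    return min(Fraction(1), max(Fraction(0), s))

for x in (Fraction(0), Fraction(1), Fraction(3, 2), Fraction(2)):
    print(x, f_level(x, 6), f_limit(x))
```

In this example the level-$n$ approximation differs from the limit function by at most $2^{-n}$, the limit equals $1$ on $F$, and it vanishes outside $[-\tfrac{19}{10},\tfrac{19}{10}]\subseteq U$.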


The support of a continuous function $f:X\to\mathbb{K}$, where $X$ is a topological space,
is defined as the complement of the largest open set $U\subseteq X$ such that $f = 0$ on $U$ or,
equivalently, as the closure of the set $\{x\in X : f(x)\ne0\}$.


Theorem C.11 (Partition of unity). Let $X$ be a normal space and let

\[ F\subseteq U_1\cup\cdots\cup U_k, \]

where $F$ is compact and the sets $U_j$ are open in $X$ for all $j = 1,\dots,k$. Then there exist
nonnegative continuous functions $f_j:X\to[0,1]$ with support in $U_j$, $j = 1,\dots,k$, such
that

\[ f_1+\cdots+f_k = 1 \ \text{ on } F. \]

The same result holds if $X$ is a locally compact Hausdorff space.


Proof Every $x\in F$ is contained in at least one of the sets $U_j$, and applying normality
to the closed sets $\{x\}$ and $\complement U_j$ we find an open subset $V$ containing $x$ and whose closure
is contained in $U_j$. Letting $x$ range over $F$ and using that $F$ is compact, it follows that
we can cover $F$ with finitely many open sets $V_1,\dots,V_n$ such that for all $m = 1,\dots,n$ we
have $\overline{V_m}\subseteq U_{j_m}$ for some $1\le j_m\le k$. Set

\[ F_j := \bigcup_{m:\,j_m = j}\overline{V_m}. \]

This set is closed and contained in $U_j$. By Urysohn’s lemma we can find continuous
functions $g_j:X\to[0,1]$ topologically supported in $U_j$ such that $g_j = 1$ on $F_j$. Put $f_1 := g_1$
and

\[ f_j := (1-g_1)\cdots(1-g_{j-1})\,g_j, \qquad j = 2,\dots,k. \]

The support of $f_j$ is contained in $U_j$ and an easy induction argument shows that

\[ f_1+\cdots+f_k = 1-(1-g_1)\cdots(1-g_k). \tag{C.1} \]

If $x\in F$, then $g_j(x) = 1$ for at least one $j = 1,\dots,k$ and therefore (C.1) implies that
$f_1+\cdots+f_k = 1$ on $F$.

The case of locally compact Hausdorff spaces may be reduced to the case of compact
Hausdorff spaces by the same argument as in Proposition 4.3.
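
The algebra behind (C.1) can be checked pointwise. The sketch below is an illustration only: it takes the values $g_1(x),\dots,g_k(x)\in[0,1]$ at a single point $x$ as input (hypothetical sample values, in exact rational arithmetic) and returns the values $f_1(x),\dots,f_k(x)$ defined in the proof.

```python
from fractions import Fraction

def partition_from_bumps(g):
    """Given g_1(x), ..., g_k(x), return f_1(x), ..., f_k(x) with
    f_1 = g_1 and f_j = (1 - g_1) ... (1 - g_{j-1}) g_j."""
    f, prod = [], Fraction(1)
    for gj in g:
        f.append(prod * gj)
        prod *= (1 - gj)
    return f

g = [Fraction(1, 3), Fraction(1, 2), Fraction(1, 4)]
f = partition_from_bumps(g)
print(f, sum(f))                       # sum equals 1 - (2/3)(1/2)(3/4) = 3/4

g_on_F = [Fraction(1, 5), Fraction(1), Fraction(1, 2)]
print(sum(partition_from_bumps(g_on_F)))   # equals 1 because some g_j(x) = 1
```

The second call illustrates the final step of the proof: as soon as one of the values $g_j(x)$ equals $1$, the product in (C.1) vanishes and the $f_j(x)$ sum to $1$.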


We conclude with a useful extension theorem. Its proof makes use of the fact, men- 
tioned in Section 2.2.a, that the space C,(X) of all bounded continuous functions on a 
topological space X is complete as a normed space endowed with the supremum norm. 
The proof of this elementary fact is direct and does not introduce circularity in our 
arguments. 


Theorem C.12 (Tietze extension theorem). Let F be a closed subset of a normal space 
X and let f : F — [0,1] be continuous. Then there exists a continuous function g : X > 
[0, 1] such that g|r = f. 


Proof The sets $A := \{x \in F : f(x) \in [0,\tfrac13]\}$ and $B := \{x \in F : f(x) \in [\tfrac23,1]\}$ are disjoint and closed in $F$, and hence closed in $X$ (since $F$ is closed in $X$). By Urysohn's lemma there exists a continuous function $g_1 : X \to [0,\tfrac13]$ such that $g_1 = 0$ on $A$ and $g_1 = \tfrac13$ on $B$. This function satisfies $0 \le f - g_1 \le \tfrac23$ pointwise on $F$. Proceeding inductively, we construct continuous functions $g_k : X \to [0, 2^{k-1}/3^k]$, $k \ge 1$, such that for every $k \ge 1$ we have

\[ g_k = 0 \ \text{ on the set } \ \Bigl\{x \in F : f(x) - \sum_{j=1}^{k-1} g_j(x) \le 2^{k-1}/3^k\Bigr\} \]

and

\[ g_k = 2^{k-1}/3^k \ \text{ on the set } \ \Bigl\{x \in F : f(x) - \sum_{j=1}^{k-1} g_j(x) \ge 2^k/3^k\Bigr\}. \]

We then have $0 \le f - \sum_{j=1}^{k} g_j \le 2^k/3^k$ pointwise on $F$; the lower bound is clear from the construction and the upper bound follows by induction.

Set $g := \sum_{k\ge1} g_k$. Since $0 \le g_k \le 2^{k-1}/3^k$ and $\sum_{k\ge1} 2^{k-1}/3^k = 1$, the partial sums of this sum converge uniformly and $g$ takes values in $[0,1]$. Therefore $g$ is continuous, by the completeness of $C_b(X)$. On the set $F$ we have

\[ 0 \le f - g \le f - \sum_{j=1}^{k} g_j \le 2^k/3^k \]

for every $k \ge 1$, forcing that $f = g$ on $F$.
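
The bookkeeping of the induction can be illustrated at a single point of $F$. The sketch below is not the construction of the proof: instead of invoking Urysohn's lemma it uses a clamping formula as one admissible choice for the value $g_k(x)$ (it equals $0$ when the remainder is at most $2^{k-1}/3^k$ and equals $2^{k-1}/3^k$ when the remainder is at least $2^k/3^k$). The function name `tietze_remainders` is hypothetical.

```python
def tietze_remainders(value, steps):
    """Track the remainders f(x) - sum_{j<=k} g_j(x) at one point x of F.

    `value` plays the role of f(x) in [0, 1]. At step k we subtract
    g_k(x) = clamp(remainder - c, 0, c) with c = 2**(k-1) / 3**k,
    which is an assumed (but admissible) choice of g_k on F.
    Returns the list of remainders after steps 1, ..., steps."""
    remainder = value
    remainders = []
    for k in range(1, steps + 1):
        c = 2 ** (k - 1) / 3 ** k
        g_k = min(max(remainder - c, 0.0), c)
        remainder -= g_k
        remainders.append(remainder)
    return remainders


# The remainder after step k stays in [0, (2/3)**k], up to rounding.
for value in (0.0, 0.2, 0.5, 0.9, 1.0):
    rs = tietze_remainders(value, 20)
    ok = all(-1e-12 <= r <= (2 / 3) ** k + 1e-12
             for k, r in enumerate(rs, start=1))
    print(value, ok)
```

The geometric bound $(2/3)^k$ is what forces $f = g$ on $F$ in the limit.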


Tychonov’s Theorem 


Let $I$ be a nonempty set and suppose that for every $i \in I$ a topological space $(X_i,\tau_i)$ is given. The cartesian product of the family $(X_i)_{i\in I}$ is the set $X = \prod_{i\in I} X_i$ whose elements are the mappings $x : I \to \bigcup_{i\in I} X_i$ with the property that $x(i) \in X_i$ for all $i \in I$. For each $i \in I$ we define the coordinate mapping $p_i : X \to X_i$ by $p_i(x) := x(i)$. The product topology of $X = \prod_{i\in I} X_i$ is the topology generated by the sets $p_i^{-1}(U_i)$, where $U_i$ ranges over all open sets in $X_i$ and $i$ ranges over $I$. It is the smallest topology $\tau = \prod_{i\in I}\tau_i$ with the property that all coordinate mappings $p_i : x \mapsto x(i)$ are continuous as mappings from $X$ into $X_i$. If $I = \{i_1,\dots,i_k\}$ is finite, the topology of $X = \prod_{j=1}^{k} X_{i_j}$ coincides with the topology generated by the sets of the form $U = U_{i_1} \times \dots \times U_{i_k}$ with $U_{i_j}$ open in $X_{i_j}$ for all $j = 1,\dots,k$.


Theorem C.13 (Tychonov). The product of any family of compact spaces is a compact 
space. If each one of the spaces is Hausdorff, then so is its product. 


Proof Let $X = \prod_{i\in I} X_i$, where $(X_i,\tau_i)$ is a compact topological space for each $i \in I$. If $X_i = \emptyset$ for some $i \in I$ we have $X = \emptyset$ and there is nothing to prove. We may therefore assume that $X_i \ne \emptyset$ for all $i \in I$.

Fix a collection $\mathcal{C}$ of closed subsets of $X$ with the finite intersection property. We wish to prove that $\bigcap_{C\in\mathcal{C}} C \ne \emptyset$. Once this has been proved, Proposition C.5 implies that $X$ is compact.

Let $D$ be the set of all collections $\mathcal{F}$ of subsets of $X$ which have the finite intersection property and contain $\mathcal{C}$ as a subcollection. The set $D$ is nonempty (it contains $\mathcal{C}$) and can be partially ordered by set inclusion, that is, we declare $\mathcal{F} \le \mathcal{F}'$ to mean that $\mathcal{F} \subseteq \mathcal{F}'$. Note that we do not insist on closedness of the sets in $\mathcal{F}$.

Let $T \subseteq D$ be a totally ordered subset, that is, a subset with the property that for all $\mathcal{F}_1, \mathcal{F}_2 \in T$ we have either $\mathcal{F}_1 \subseteq \mathcal{F}_2$ or $\mathcal{F}_2 \subseteq \mathcal{F}_1$. We claim that $\bigcup_{\mathcal{F}\in T} \mathcal{F}$ belongs to $D$. For this it suffices to check that this union has the finite intersection property. To this end suppose that $F_1,\dots,F_k \in \bigcup_{\mathcal{F}\in T}\mathcal{F}$, say $F_j \in \mathcal{F}_j \in T$ for $j = 1,\dots,k$. Since $T$ is totally ordered, after relabelling we may assume that $\mathcal{F}_1 \subseteq \dots \subseteq \mathcal{F}_k$. Then every $F_j$ belongs to $\mathcal{F}_k$ and therefore the finite intersection property of $\mathcal{F}_k$ implies that $\bigcap_{j=1}^{k} F_j \ne \emptyset$. This proves that $\bigcup_{\mathcal{F}\in T}\mathcal{F}$ has the finite intersection property.

Evidently, the union $\bigcup_{\mathcal{F}\in T}\mathcal{F}$ is an upper bound for $T$ in $D$. We may therefore apply Zorn's lemma and obtain that $D$ has a maximal element. We denote it by $\mathcal{M}$. For each $i \in I$ consider the collection

\[ \mathcal{M}_i := \{\overline{p_i(M)} : M \in \mathcal{M}\}, \]

where $p_i : x \mapsto x(i)$ are the coordinate mappings. It consists of closed subsets of $X_i$ and has the finite intersection property since $\mathcal{M}$ has it. Since $X_i$ is compact, the set $Y_i := \bigcap_{M\in\mathcal{M}} \overline{p_i(M)}$ is nonempty by Proposition C.5. Using the Axiom of Choice, for every $i \in I$ choose a $y_i \in Y_i$ and let $x \in X$ be defined by $x(i) := y_i$, $i \in I$. We will prove in two steps that $x \in \overline{M}$ for all $M \in \mathcal{M}$.


Step 1 – If $U_i$ is open in $X_i$ and contains $x(i) = y_i$, the fact that $x(i) \in \overline{p_i(M)}$ for all $M \in \mathcal{M}$ implies that $p_i(M) \cap U_i \ne \emptyset$ and hence $M \cap p_i^{-1}(U_i) \ne \emptyset$ for all $M \in \mathcal{M}$. It follows that the collection $\mathcal{M} \cup \{p_i^{-1}(U_i) : i \in I\}$ has the finite intersection property and belongs to $D$. By maximality, this collection equals $\mathcal{M}$. Therefore $p_i^{-1}(U_i) \in \mathcal{M}$ for all $i \in I$.

Step 2 – Let $U$ be an open set in $X$ containing $x$. By the definition of the product topology there are indices $i_1,\dots,i_k \in I$ and open sets $U_{i_j} \in \tau_{i_j}$ for $j = 1,\dots,k$ such that $x \in \bigcap_{j=1}^{k} p_{i_j}^{-1}(U_{i_j}) \subseteq U$. By the result of Step 1 and the fact that $\mathcal{M}$ has the finite intersection property we have $M \cap p_{i_1}^{-1}(U_{i_1}) \cap \dots \cap p_{i_k}^{-1}(U_{i_k}) \ne \emptyset$ for all $M \in \mathcal{M}$. In particular, $M \cap U \ne \emptyset$ for all $M \in \mathcal{M}$. This being true for all open sets $U$ containing $x$, it follows that $x \in \overline{M}$ for all $M \in \mathcal{M}$.


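It now follows that

\[ x \in \bigcap_{M\in\mathcal{M}} \overline{M} \subseteq \bigcap_{C\in\mathcal{C}} \overline{C} = \bigcap_{C\in\mathcal{C}} C, \]

where we used that $\mathcal{C} \subseteq \mathcal{M}$ and the fact that the elements of $\mathcal{C}$ are closed sets. Therefore $\bigcap_{C\in\mathcal{C}} C \ne \emptyset$ and we conclude that $X$ is compact.

Suppose now that each space $X_i$ is Hausdorff. If $x, x' \in X$ and $x \ne x'$, then for some $i \in I$ we must have $x(i) \ne x'(i)$ and since $X_i$ is Hausdorff there are disjoint open sets $U_i$ and $U_i'$ in $X_i$ containing $x(i)$ and $x'(i)$, respectively. Their inverse images under $p_i$ are open and disjoint in $X$, and contain $x$ and $x'$, respectively.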


Appendix D 
Metric Spaces 


We now introduce an important class of Hausdorff spaces, namely, the class of metric 
spaces. All results of the previous appendix apply to metric spaces, but in order to make 
the present appendix independently readable some proofs are repeated. In addition, our 
treatment of metric spaces includes a number of additional topics. 


Definition and General Properties 


Definition D.1 (Metric spaces). A metric space is a pair $(X,d)$, where $X$ is a set and $d$ a metric (or distance function) on $X$, that is, a function $d : X \times X \to [0,\infty)$ such that for all $x, y, z$ in $X$ the following conditions are satisfied:


(i) $d(x,y) = 0 \Leftrightarrow x = y$;
(ii) $d(x,y) = d(y,x)$;
(iii) $d(x,z) \le d(x,y) + d(y,z)$ (the triangle inequality).


In what follows we omit the distance function d from the notation and write X instead 
of the more cumbersome (X,d) to denote metric spaces, except in those situations where 
confusion could arise. 

Let $X$ be a metric space, let $x \in X$, and let $r > 0$. The set

\[ B(x;r) := \{y \in X : d(x,y) < r\} \]

is called the open ball with centre $x$ and radius $r$. The set

\[ \overline{B}(x;r) := \{y \in X : d(x,y) \le r\} \]

is called the closed ball with centre $x$ and radius $r$.

A subset $S$ of a metric space $X$ is called open if for all $x \in S$ there exists an $r > 0$ such that $B(x;r) \subseteq S$. A subset $S$ of a metric space $X$ is called closed if its complement $\complement S = X \setminus S$ is open. It is an easy consequence of the triangle inequality that every open ball $B(x;r)$ is open and every closed ball $\overline{B}(x;r)$ is closed; this justifies the terminology ‘open ball’ and ‘closed ball’.

The interior of a subset $S$ of a metric space $X$ is the union of all open subsets of $X$ contained in $S$ and is denoted by $S^\circ$. It is the largest open subset of $X$ contained in $S$. The closure of a subset $S$ of a metric space $X$ is the intersection of all closed subsets of $X$ containing $S$ and is denoted by $\overline{S}$. It is the smallest closed subset of $X$ containing $S$.

The closure $\overline{B(x;r)}$ of an open ball is always contained in the closed ball $\overline{B}(x;r)$, but this inclusion may be strict. For example, take $X = \mathbb{Z}$ with distance function $d(m,n) = |n - m|$. The open ball $B(0;1) = \{0\}$ is also closed, so its closure equals $\overline{B(0;1)} = \{0\}$. On the other hand, $\overline{B}(0;1) = \{-1,0,1\}$.

For any metric space (X,d), the collection of its open sets is a Hausdorff topology, 
the so-called Borel topology of X. More is true: 


Proposition D.2. Every metric space is normal. 


Proof Let $F$ and $G$ be disjoint closed sets in a metric space $X$. We must find disjoint open sets $U$ and $V$ such that $F \subseteq U$ and $G \subseteq V$. We may assume that $F$ and $G$ are both nonempty, since otherwise the result is trivial.

The function

\[ f(x) := \frac{d(x,F)}{d(x,F) + d(x,G)} \]

is well defined, continuous, takes values in $[0,1]$, and satisfies $f = 0$ on $F$ and $f = 1$ on $G$. The sets $U := \{x \in X : f(x) < \tfrac12\}$ and $V := \{x \in X : f(x) > \tfrac12\}$ are open by Proposition C.3 and have the desired properties.
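
The separating function in this proof is easy to evaluate in simple cases. The sketch below assumes $X = \mathbb{R}$ with the usual distance and takes $F$ and $G$ to be finite (hence closed) disjoint sets, so that $d(x,F)$ is a minimum over finitely many points; the names `dist_to_set` and `separating_function` are hypothetical.

```python
def dist_to_set(x, points):
    """d(x, A) = min over a in A of |x - a|, for a finite nonempty set A."""
    return min(abs(x - a) for a in points)


def separating_function(x, F, G):
    """f(x) = d(x, F) / (d(x, F) + d(x, G)) for disjoint finite sets F and G."""
    d_f = dist_to_set(x, F)
    d_g = dist_to_set(x, G)
    return d_f / (d_f + d_g)  # the denominator is positive because F and G are disjoint


F = [0.0, 1.0]
G = [3.0, 5.0]

for x in (0.0, 1.0, 1.5, 2.0, 2.5, 3.0):
    value = separating_function(x, F, G)
    side = "U" if value < 0.5 else ("V" if value > 0.5 else "neither")
    print(x, value, side)
```

Points of $F$ receive the value $0$ and points of $G$ the value $1$, so the sets $\{f < \tfrac12\}$ and $\{f > \tfrac12\}$ separate them.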


Convergence 


A sequence $(x_n)_{n\ge1}$ in a metric space $X$ is called convergent if there exists an $x \in X$ such that for all $\varepsilon > 0$ there is an index $N \ge 1$ with the property $d(x_n,x) < \varepsilon$ for all $n \ge N$. We then write

\[ \lim_{n\to\infty} x_n = x \]

and call $x$ a limit of the sequence $(x_n)_{n\ge1}$. It is clear that $\lim_{n\to\infty} x_n = x$ if and only if $\lim_{n\to\infty} d(x_n,x) = 0$. Limits are unique, for if $\lim_{n\to\infty} x_n = x$ and $\lim_{n\to\infty} x_n = y$, then for all indices $n$ the triangle inequality gives $0 \le d(x,y) \le d(x,x_n) + d(x_n,y) \to 0 + 0 = 0$ and therefore $d(x,y) = 0$.

A subset $S$ of a metric space $X$ is called sequentially closed if the limit of every sequence in $S$ that converges in $X$ belongs to $S$.


Proposition D.3. For a subset $S$ in a metric space $X$, the following assertions are equivalent:


(1) $S$ is closed;
(2) $S$ is sequentially closed.


Proof (1)⇒(2): Let $(x_n)_{n\ge1}$ be a sequence in $S$, convergent in $X$ with limit $x$. We must show that $x \in S$. Assume the contrary. Then $x \in \complement S$ (the complement of $S$), and because $\complement S$ is open there is an $\varepsilon > 0$ such that $B(x;\varepsilon) \subseteq \complement S$. On the other hand, since $(x_n)_{n\ge1}$ converges to $x$ there is an index $N \ge 1$ such that $d(x_n,x) < \varepsilon$ for all $n \ge N$, that is, $x_n \in B(x;\varepsilon)$ (and hence $x_n \in \complement S$) for all $n \ge N$. For these indices we obtain the contradiction $x_n \in S \cap \complement S$.


(2)⇒(1): We must show that $\complement S$ is open. Choose $x \in \complement S$ arbitrarily. We must show that there is an $\varepsilon > 0$ such that $B(x;\varepsilon) \subseteq \complement S$. Suppose that such an $\varepsilon > 0$ does not exist. Then for every $n \ge 1$ we can find an $x_n \in B(x;\tfrac1n) \cap S$. The resulting sequence $(x_n)_{n\ge1}$ is contained in $S$ and satisfies $d(x_n,x) < \tfrac1n$ for all $n \ge 1$, that is, we have $\lim_{n\to\infty} x_n = x$. Because $S$ is sequentially closed we conclude that $x \in S$, in contradiction with the assumption that $x \in \complement S$.


As a corollary we have the following useful criterion for determining which elements belong to the closure of a given set:


Proposition D.4. For a subset $S$ in a metric space $X$ and a point $x \in X$, the following assertions are equivalent:


(1) $x \in \overline{S}$;
(2) $S \cap B(x;\varepsilon) \ne \emptyset$ for all $\varepsilon > 0$;
(3) there exists a sequence $(x_n)_{n\ge1}$ in $S$ with $\lim_{n\to\infty} x_n = x$.


In particular, a set $S$ is dense in the closed set $S'$ if and only if every $s' \in S'$ is the limit of a sequence in $S$.


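Proof (1)⇒(2): If $S \cap B(x;\varepsilon) = \emptyset$ for some $\varepsilon > 0$, then $S \subseteq \complement B(x;\varepsilon)$. Because $\complement B(x;\varepsilon)$ is closed, this implies $\overline{S} \subseteq \complement B(x;\varepsilon)$ and therefore $x \notin \overline{S}$.

(2)⇒(3): For every $n \ge 1$ we choose $x_n \in S \cap B(x;\tfrac1n)$. In this way we obtain a sequence $(x_n)_{n\ge1}$ in $S$ converging to $x$.


(3)⇒(1): If $x \notin \overline{S}$, then $x \in \complement\overline{S}$ and this set is open. Hence there exists an $\varepsilon > 0$ such that $B(x;\varepsilon) \subseteq \complement\overline{S}$. In particular it holds that $d(x,y) \ge \varepsilon$ for all $y \in S$. This implies that no sequence in $S$ can converge to $x$.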
A sequence $(x_n)_{n\ge1}$ in a metric space $X$ is called a Cauchy sequence if for all $\varepsilon > 0$ there is an index $N \ge 1$ such that $d(x_n,x_m) < \varepsilon$ for all $n,m \ge N$.


Proposition D.5. In a metric space X the following assertions hold: 


(1) every convergent sequence is a Cauchy sequence; 
(2) every Cauchy sequence with a convergent subsequence is convergent. 


Proof (1): Let $x$ be the limit of the convergent sequence $(x_n)_{n\ge1}$ and let $\varepsilon > 0$ be arbitrary. Choose $N$ so large that $d(x_k,x) < \varepsilon$ for all $k \ge N$. By the triangle inequality, for all $n,m \ge N$ it follows that $d(x_n,x_m) \le d(x_n,x) + d(x,x_m) < \varepsilon + \varepsilon = 2\varepsilon$.

(2): Let $(x_n)_{n\ge1}$ be a Cauchy sequence with a subsequence $(x_{n_k})_{k\ge1}$ convergent to $x$. We check that $(x_n)_{n\ge1}$ converges to $x$. Choose $\varepsilon > 0$ arbitrarily and let $N \ge 1$ be such that $d(x_n,x_m) < \varepsilon$ for all $m,n \ge N$. Let $K \ge 1$ be such that for all $k \ge K$ we have both $n_k \ge N$ and $d(x_{n_k},x) < \varepsilon$. Now choose $k \ge K$ arbitrarily. Then for all $n \ge N$ we have $d(x_n,x) \le d(x_n,x_{n_k}) + d(x_{n_k},x) < \varepsilon + \varepsilon = 2\varepsilon$.


A subset $S$ of a metric space $X$ is called complete if every Cauchy sequence contained in $S$ converges to a limit in $S$. In particular, $X$ is complete if every Cauchy sequence in $X$ is convergent in $X$.

[Figure: Augustin-Louis Cauchy, 1789–1857]


Theorem D.6 (Completion). If $X$ is a metric space, there exists a complete metric space $(\overline{X},\overline{d})$ and a mapping $i : X \to \overline{X}$ with the following properties:


(i) $i$ is isometric, that is, $\overline{d}(ix,iy) = d(x,y)$ for all $x,y \in X$;
(ii) $i$ has dense range, that is, $i(X)$ is dense in $\overline{X}$.


Moreover, if $(\widetilde{X},\widetilde{d})$ is another complete metric space and $i' : X \to \widetilde{X}$ is a mapping satisfying (i) and (ii), then the identity mapping on $X$ has a unique extension to an isometry from $\overline{X}$ onto $\widetilde{X}$.


A more precise way of stating the last assertion is that there exists a unique isometry $j$ from $\overline{X}$ onto $\widetilde{X}$ which satisfies $j(ix) = i'x$ for all $x \in X$.


Proof On the set of Cauchy sequences in $X$ we define an equivalence relation by declaring the Cauchy sequences $(x_n)_{n\ge1}$ and $(x_n')_{n\ge1}$ equivalent if $\lim_{n\to\infty} d(x_n,x_n') = 0$. Let $\overline{X}$ be the set of all equivalence classes. Denoting the equivalence class of a sequence $(x_n)_{n\ge1}$ by $\overline{x}$, on $\overline{X}$ we define a metric $\overline{d}$ by

\[ \overline{d}(\overline{x},\overline{y}) := \lim_{n\to\infty} d(x_n,y_n). \]

Using the triangle inequality it is readily checked that this limit indeed exists and is independent of the choice of sequences $(x_n)_{n\ge1}$ and $(y_n)_{n\ge1}$ representing $\overline{x}$ and $\overline{y}$.

If $x,y \in X$, then the constant sequences $(x)_{n\ge1}$ and $(y)_{n\ge1}$ are Cauchy and therefore define elements of $\overline{X}$, which we shall denote by $\overline{x}$ and $\overline{y}$. We obtain a mapping $i : X \to \overline{X}$ by declaring $ix := \overline{x}$. From $\overline{d}(ix,iy) = \overline{d}(\overline{x},\overline{y}) = \lim_{n\to\infty} d(x,y) = d(x,y)$ we see that this mapping is isometric.

This gives property (i). To prove property (ii) let $\overline{x} \in \overline{X}$, and let $(x_n)_{n\ge1}$ be a Cauchy sequence in $X$ in its equivalence class. For any $\varepsilon > 0$ we may choose $N$ so large that $d(x_n,x_m) < \varepsilon$ for all $m,n \ge N$. Then $\overline{d}(\overline{x}, ix_N) = \lim_{n\to\infty} d(x_n,x_N) \le \varepsilon$. This shows that $i(X)$ is dense in $\overline{X}$.

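Next we prove that $\overline{X}$ is complete. Suppose $(\overline{x}_n)_{n\ge1}$ is a Cauchy sequence in $\overline{X}$. By the density of $X$ in $\overline{X}$ we may pick elements $x_n \in X$ such that $\overline{d}(\overline{x}_n,x_n) < \tfrac1n$; here we identify elements of $X$ with elements of $\overline{X}$ by means of the mapping $i$. From

\[ d(x_n,x_m) = \overline{d}(x_n,x_m) \le \overline{d}(x_n,\overline{x}_n) + \overline{d}(\overline{x}_n,\overline{x}_m) + \overline{d}(\overline{x}_m,x_m) < \frac1n + \overline{d}(\overline{x}_n,\overline{x}_m) + \frac1m \]

and the fact that the middle term tends to $0$ as $n,m \to \infty$, we infer that $(x_n)_{n\ge1}$ is a Cauchy sequence in $X$. Let $\overline{x} \in \overline{X}$ be its equivalence class. As was shown in the proof of density, we have $\lim_{n\to\infty} \overline{d}(x_n,\overline{x}) = 0$. Then

\[ \overline{d}(\overline{x}_n,\overline{x}) \le \overline{d}(\overline{x}_n,x_n) + \overline{d}(x_n,\overline{x}) < \frac1n + \overline{d}(x_n,\overline{x}) \]

shows that $\lim_{n\to\infty} \overline{d}(\overline{x}_n,\overline{x}) = 0$. This proves the completeness of $\overline{X}$.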
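Let $I : X \to X$ denote the identity mapping on $X$, and let $(\widetilde{X},\widetilde{d})$ and $i' : X \to \widetilde{X}$ satisfy (i) and (ii). We obtain an isometry $I : \overline{X} \to \widetilde{X}$ by putting

\[ I\overline{x} := \lim_{n\to\infty} I x_n = \lim_{n\to\infty} x_n, \]

where $(x_n)_{n\ge1}$ is a Cauchy sequence representing $\overline{x}$ and the limit on the right-hand side is taken in $\widetilde{X}$ (identifying $X$ with its image under $i'$). This limit exists because the sequence $(x_n)_{n\ge1}$ is Cauchy in the complete space $\widetilde{X}$. The resulting mapping $I$ is an isometry from $\overline{X}$ onto $\widetilde{X}$, whose inverse is obtained by applying the same procedure with the roles of $\overline{X}$ and $\widetilde{X}$ interchanged.

If $I'$ is another isometry from $\overline{X}$ onto $\widetilde{X}$ extending the identity mapping on $X$, then $I'\overline{x} = \lim_{n\to\infty} I' x_n = I\overline{x}$ for all $\overline{x} \in \overline{X}$ represented by the Cauchy sequence $(x_n)_{n\ge1}$ in $X$. This gives the uniqueness of $I$.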
A complete metric space $(\overline{X},\overline{d})$ is called a completion of $(X,d)$ if there exists a mapping $i : X \to \overline{X}$ with properties (i) and (ii). The second part of the theorem asserts that completions are “unique up to isometry”. To avoid this minor ambiguity we agree to call the metric space $(\overline{X},\overline{d})$ constructed in the above proof “the” completion of $(X,d)$.
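
The equivalence relation on Cauchy sequences can be illustrated in $X = \mathbb{Q}$ with $d(x,y) = |x - y|$. The sketch below uses exact rational arithmetic; the choice of Newton's iteration for $\sqrt{2}$ and the name `newton_sqrt2` are assumptions made for illustration only. Two different starting points give two Cauchy sequences in $\mathbb{Q}$ with $d(x_n,y_n) \to 0$, so they represent the same element of the completion, although neither has a limit in $\mathbb{Q}$.

```python
from fractions import Fraction


def newton_sqrt2(start, steps):
    """Rational Newton iterates x_{n+1} = (x_n + 2 / x_n) / 2, as exact Fractions."""
    xs = [Fraction(start)]
    for _ in range(steps):
        x = xs[-1]
        xs.append((x + 2 / x) / 2)
    return xs


xs = newton_sqrt2(1, 6)
ys = newton_sqrt2(3, 6)

# Consecutive terms get close quickly, as expected of a Cauchy sequence.
print([float(abs(xs[n + 1] - xs[n])) for n in range(6)])
# The two sequences approach each other: they are equivalent.
print(float(abs(xs[6] - ys[6])))
# No term squares to 2, since the square root of 2 is irrational.
print(all(x * x != 2 for x in xs))
```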


Continuity 
In the context of metric spaces this definition takes the familiar “$\varepsilon$/$\delta$”-form:


Proposition D.7. A mapping $f : X \to Y$ is continuous at the point $x_0 \in X$ if and only if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for all $x \in X$ with $d_X(x,x_0) < \delta$ we have

\[ d_Y(f(x),f(x_0)) < \varepsilon. \]

This is clear from the fact that every open set $U$ in $X$ containing $x_0$ contains the open balls $B(x_0;\delta)$ for sufficiently small $\delta > 0$ and every open set $V$ in $Y$ containing $f(x_0)$ contains the open balls $B(f(x_0);\varepsilon)$ for sufficiently small $\varepsilon > 0$.

A mapping $f : X \to Y$ is called sequentially continuous at the point $x_0 \in X$ if for every sequence $(x_n)_{n\ge1}$ in $X$ with $\lim_{n\to\infty} x_n = x_0$ we have $\lim_{n\to\infty} f(x_n) = f(x_0)$. We call $f$ sequentially continuous if $f$ is sequentially continuous at every point of $X$.


Proposition D.8. For a mapping $f : X \to Y$ between the metric spaces $X$ and $Y$ the following assertions are equivalent:


(1) $f$ is continuous at the point $x_0 \in X$;
(2) $f$ is sequentially continuous at the point $x_0 \in X$.


In particular, $f$ is continuous if and only if $f$ is sequentially continuous.


Proof (1)⇒(2): Suppose that $\lim_{n\to\infty} x_n = x_0$ in $X$. Choose $\varepsilon > 0$ arbitrarily and choose, using the continuity of $f$, a $\delta > 0$ such that $d_Y(f(x),f(x_0)) < \varepsilon$ for all $x \in X$ with $d_X(x,x_0) < \delta$. Because $\lim_{n\to\infty} d_X(x_n,x_0) = 0$ we can find an index $N \ge 1$ such that $d_X(x_n,x_0) < \delta$ for all $n \ge N$. For all $n \ge N$ it then holds that $d_Y(f(x_n),f(x_0)) < \varepsilon$.


(2)⇒(1): Suppose that there is an $\varepsilon > 0$ such that no $\delta > 0$ can be found with the property that $d_Y(f(x),f(x_0)) < \varepsilon$ for all $x \in X$ with $d_X(x,x_0) < \delta$. Then for every $n \ge 1$ we can find $x_n \in X$ with $d_X(x_n,x_0) < \tfrac1n$ and $d_Y(f(x_n),f(x_0)) \ge \varepsilon$. Then $\lim_{n\to\infty} x_n = x_0$. Hence, by the sequential continuity of $f$, $\lim_{n\to\infty} f(x_n) = f(x_0)$. This contradicts the fact that $d_Y(f(x_n),f(x_0)) \ge \varepsilon$ for all $n \ge 1$.


A function $f : X \to Y$ is uniformly continuous if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that whenever $x,y \in X$ satisfy $d_X(x,y) < \delta$, then $d_Y(f(x),f(y)) < \varepsilon$. Every uniformly continuous function is continuous, but the converse is false even for bounded functions: the function $f : (0,1) \to [-1,1]$, $f(x) = \sin(1/x)$, is continuous but not uniformly continuous. Below we prove that if $X$ is compact, then every continuous function from $X$ into another metric space is uniformly continuous.
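
The failure of uniform continuity for $f(x) = \sin(1/x)$ on $(0,1)$ can be made concrete. The sketch below uses the points $x_n = 1/(2\pi n + \pi/2)$ and $y_n = 1/(2\pi n + 3\pi/2)$, which are one possible choice of witnesses (the name `witness_pair` is hypothetical): their distance tends to $0$, while $f(x_n) = 1$ and $f(y_n) = -1$, so no single $\delta$ works for $\varepsilon = 1$.

```python
import math


def witness_pair(n):
    """Two points of (0, 1) that are close together but where sin(1/x) differs by 2."""
    x = 1.0 / (2 * math.pi * n + math.pi / 2)      # sin(1/x) = 1
    y = 1.0 / (2 * math.pi * n + 3 * math.pi / 2)  # sin(1/y) = -1
    return x, y


for n in (1, 10, 100, 1000):
    x, y = witness_pair(n)
    gap = abs(x - y)                                 # tends to 0 as n grows
    jump = abs(math.sin(1 / x) - math.sin(1 / y))    # stays close to 2
    print(n, gap, jump)
```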

Uniformly continuous functions have the following extension property: 


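Proposition D.9. Let $X$ and $Y$ be metric spaces, with $Y$ complete. If $f : X \to Y$ is uniformly continuous, there exists a unique uniformly continuous function $\overline{f} : \overline{X} \to Y$ extending $f$.


Proof Let $\overline{x} \in \overline{X}$ and choose a sequence $x_n \to \overline{x}$ with each $x_n$ in $X$. Then $(x_n)_{n\ge1}$ is Cauchy in $X$, and the uniform continuity of $f$ implies that $(f(x_n))_{n\ge1}$ is Cauchy in $Y$. Since $Y$ is complete, this sequence converges to a limit, say $y$. We set $\overline{f}(\overline{x}) := y$. We need to check that $\overline{f}$ is well defined (that is, $\overline{f}(\overline{x})$ does not depend on the choice of the approximating sequence) and is uniformly continuous.

If $x_n' \to \overline{x}$ with each $x_n'$ in $X$, then $\lim_{n\to\infty} d_X(x_n,x_n') = 0$. By the uniform continuity of $f$ it follows that $\lim_{n\to\infty} d_Y(f(x_n),f(x_n')) = 0$, and therefore $\lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} f(x_n')$. This proves that $\overline{f}$ is well defined.

To prove that $\overline{f}$ is uniformly continuous let $\varepsilon > 0$ and choose $\delta > 0$ as in the definition of uniform continuity of $f$ applied with $\tfrac12\varepsilon$ instead of $\varepsilon$. If $\overline{d}_X(\overline{x},\overline{y}) < \delta$ in $\overline{X}$, and if $(x_n)_{n\ge1}$ and $(y_n)_{n\ge1}$ are approximating sequences in $X$, then $(f(x_n))_{n\ge1}$ and $(f(y_n))_{n\ge1}$ are approximating sequences for $\overline{f}(\overline{x})$ and $\overline{f}(\overline{y})$ in $Y$, and for large enough $n$ we have $d_X(x_n,y_n) < \delta$ and $d_Y(f(x_n),f(y_n)) < \tfrac12\varepsilon$. Passing to the limit we obtain $d_Y(\overline{f}(\overline{x}),\overline{f}(\overline{y})) = \lim_{n\to\infty} d_Y(f(x_n),f(y_n)) \le \tfrac12\varepsilon < \varepsilon$.

It is clear from the construction that $\overline{f}$ extends $f$. Uniqueness follows from the density of $X$ in $\overline{X}$, since two continuous functions that agree on a dense subset are equal.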


Compactness 


Let (X,d) be a metric space. We recall that a subset S of a metric space X is compact 
if every open cover of S has a finite subcover, and relatively compact if its closure is 
compact. In order to characterise compactness in terms of sequences we introduce the 
following terminology. A subset S of a metric space X is called sequentially compact 
when every sequence in S has a convergent subsequence with limit in S. 

A subset S of a metric space X is called totally bounded if for every r > 0 there is a 
finite cover of S with balls of radius r with centres in S. 


Theorem D.10 (Compactness and total boundedness). For a subset S of a metric space 
X the following assertions are equivalent: 


(1) $S$ is compact;
(2) S is sequentially compact; 
(3) S is closed, complete, and totally bounded. 
Proof (1)⇒(2): Suppose that $(x_n)_{n\ge1}$ is a sequence in $S$ not containing any subsequence converging to an element of $S$. We will construct an open cover of $S$ without finite subcover.

The assumption entails that for every $x \in S$ there is an $\varepsilon(x) > 0$ with the property that the open ball $B(x;\varepsilon(x))$ contains at most finitely many terms of the sequence $(x_n)_{n\ge1}$. Let $\mathcal{U} = \{B(x;\varepsilon(x)) : x \in S\}$. This is an open cover of $S$ without finite subcover. Indeed, every $U \in \mathcal{U}$ contains at most finitely many terms of the sequence $(x_n)_{n\ge1}$, and hence so does every finite union of sets in $\mathcal{U}$. But this sequence has infinitely many distinct terms: otherwise we could immediately exhibit a convergent subsequence.


(2)⇒(3): Suppose that $S$ is not totally bounded. We will construct a sequence in $S$ without any subsequence converging to an element of $S$.

By our assumption there exists an $\varepsilon > 0$ such that $S$ has no finite cover with $\varepsilon$-balls with centres in $S$. Choose $x_1 \in S$ arbitrarily. The collection $\{B(x_1;\varepsilon)\}$ does not cover $S$, so there is an $x_2 \in S$ with $x_2 \notin B(x_1;\varepsilon)$. Note that $d(x_1,x_2) \ge \varepsilon$.

The collection $\{B(x_1;\varepsilon), B(x_2;\varepsilon)\}$ is not a cover of $S$, so there is an $x_3 \in S$ with $x_3 \notin B(x_1;\varepsilon) \cup B(x_2;\varepsilon)$. Note that $d(x_1,x_3) \ge \varepsilon$ and $d(x_2,x_3) \ge \varepsilon$.

Continuing this way we obtain a sequence $(x_n)_{n\ge1}$ with the property that $d(x_n,x_m) \ge \varepsilon$ for all choices of $n \ne m$. This sequence has no Cauchy subsequence, and therefore no convergent subsequence.

Next we prove that $S$ is complete. Suppose that $(x_n)_{n\ge1}$ is a Cauchy sequence in $S$. Because $S$ is sequentially compact this sequence has a convergent subsequence with limit $x$ in $S$. But then $(x_n)_{n\ge1}$ itself converges to $x$. This shows that $S$ is complete.


(3)⇒(1): Suppose, for a contradiction, that $S$ is totally bounded but not compact.

Since $S$ is not compact, there is an open cover $\mathcal{U}$ of $S$ without finite subcover. Since $S$ is totally bounded, for every $n \ge 1$ we can find a finite cover $\mathcal{B}_n$ of $S$ consisting of $\tfrac1n$-balls with centres in $S$.

There is a ball $B_1 \in \mathcal{B}_1$ such that $S \cap B_1$ cannot be covered by finitely many open sets in $\mathcal{U}$. In the same way there is a ball $B_2 \in \mathcal{B}_2$ such that $S \cap B_1 \cap B_2$ cannot be covered by finitely many open sets in $\mathcal{U}$. Continuing in this way we find a sequence of balls $B_n \in \mathcal{B}_n$ such that $S \cap B_1 \cap \dots \cap B_n$ cannot be covered by finitely many open sets in $\mathcal{U}$.

The sequence of centres $(x_n)_{n\ge1}$ of these balls is a Cauchy sequence in $S$. To see this, we note that for all $n,m \ge 1$ the intersection $B_n \cap B_m$ is nonempty. If $x_{nm}$ is an element in the intersection, with the triangle inequality we find that

\[ d(x_n,x_m) \le d(x_n,x_{nm}) + d(x_{nm},x_m) < \frac1n + \frac1m. \]

In view of $\lim_{n,m\to\infty} \bigl(\tfrac1n + \tfrac1m\bigr) = 0$ our assertion follows.

By completeness, the sequence $(x_n)_{n\ge1}$ converges to a limit $x$ which belongs to $S$. Choose $U \in \mathcal{U}$ such that $x \in U$ and choose $r > 0$ such that $B(x;r) \subseteq U$. Choose $N$ so large that $d(x_n,x) < \tfrac12 r$ for all $n \ge N$ and $\tfrac1N < \tfrac12 r$. Then

\[ B_1 \cap \dots \cap B_N \subseteq B_N = B(x_N;\tfrac1N) \subseteq B(x;r) \subseteq U. \]

But this means that $S \cap B_1 \cap \dots \cap B_N$ is covered by the finite subcollection $\{U\}$ of $\mathcal{U}$. This contradiction concludes the proof.


The equivalence of (1) and (3) implies that in a complete metric space, a subset is relatively compact if and only if it is totally bounded. The ‘only if’ part is trivial, and for the ‘if’ part we note that the closure of a totally bounded set is totally bounded; for if $S$ can be covered with finitely many balls $B(x_n;\varepsilon)$ with centres in $S$, the balls $B(x_n;2\varepsilon)$ cover $\overline{S}$. Now it remains to observe that a closed subset of a complete metric space is complete.
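
The point-picking argument in the proof of (2)⇒(3) can be run on a finite set, where it must terminate. The sketch below assumes points in the plane with the Euclidean distance and uses the hypothetical name `greedy_net`: a point is kept as a new centre only if it lies outside all $\varepsilon$-balls around the centres chosen so far. The resulting centres are pairwise at distance at least $\varepsilon$, and the $\varepsilon$-balls around them cover the whole set.

```python
import math


def greedy_net(points, eps):
    """Pick centres greedily: keep a point only if it lies outside every
    open eps-ball around the centres chosen so far."""
    centres = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centres):
            centres.append(p)
    return centres


# An 11 x 11 grid in the unit square as a stand-in for a totally bounded set.
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
centres = greedy_net(grid, 0.35)

# Every grid point lies in the open eps-ball around some centre.
covered = all(any(math.dist(p, c) < 0.35 for c in centres) for p in grid)
# Distinct centres are at distance at least eps from each other.
separated = all(math.dist(c1, c2) >= 0.35
                for i, c1 in enumerate(centres) for c2 in centres[i + 1:])
print(len(centres), covered, separated)
```

In a set that is not totally bounded the same selection never stops for some $\varepsilon$, which is how the proof produces a sequence without a Cauchy subsequence.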


Theorem D.11 (Bolzano–Weierstrass). A subset of $\mathbb{R}^d$ is compact if and only if it is closed and bounded.


Proof We have seen that, in any metric space, compact sets are always closed and 
bounded. Suppose, conversely, that the set S is closed and bounded. 


Step 1 – We prove the theorem for $d = 1$ and the interval $[a,b]$. Let $\mathcal{U}$ be a cover for $[a,b]$ with open subsets of $\mathbb{R}$. We must show that $\mathcal{U}$ contains a finite subcover.

Let us call a point $x \in [a,b]$ reachable from $a$ if there is a finite subcollection of $\mathcal{U}$ covering $[a,x]$. Let $S$ be the set of all points that are reachable from $a$. We must show that $b \in S$.

First we observe that $S$ is nonempty: clearly we have $a \in S$. Since $S$ is bounded above (by $b$) we may put $p := \sup S$. Choose $U \in \mathcal{U}$ such that $p \in U$. Because $U$ is open, there is an $\varepsilon > 0$ with $(p-\varepsilon,p+\varepsilon) \subseteq U$. Because $p = \sup S$ we can find an $x \in S$ with $p - \varepsilon < x \le p$. Choose a finite subcollection $\mathcal{U}'$ of $\mathcal{U}$ covering $[a,x]$. The collection $\mathcal{U}'' = \mathcal{U}' \cup \{U\}$ is a finite subcollection of $\mathcal{U}$ covering $[a,p]$. We conclude that $p \in S$. We can also conclude that $p = b$. Indeed, if we had that $p < b$, then we could find a $y \in [a,b] \cap (p,p+\varepsilon)$. Then $\mathcal{U}''$ also covers the interval $[a,y]$, and it follows that $y \in S$. This contradicts the fact that $p = \sup S$.


Step 2 – Suppose now that $S \subseteq \mathbb{R}^d$ is closed and bounded. Because $S$ is bounded, we can find an $r > 0$ such that $S \subseteq [-r,r]^d$. We claim that $[-r,r]^d$ is compact. Once this has been shown, it follows that $S$, being a closed subset of the compact set $[-r,r]^d$, is compact.

To prove that $[-r,r]^d$ is compact we show that $[-r,r]^d$ is sequentially compact. Let $(x_n)_{n\ge1}$ be a sequence in $[-r,r]^d$. The $d$ coordinate sequences are sequences in the interval $[-r,r]$, which is compact by Step 1 and hence sequentially compact by Theorem D.10. By taking $d$ consecutive subsequences we arrive at a subsequence $(x_{n_k})_{k\ge1}$ all of whose coordinate sequences converge in $[-r,r]$. The sequence $(x_{n_k})_{k\ge1}$ then converges in $\mathbb{R}^d$, with a limit in $[-r,r]^d$.
Theorem D.12. Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces with $X$ compact. Every continuous mapping $f : X \to Y$ is uniformly continuous.


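Proof Let $\varepsilon > 0$ be arbitrary. For every $x \in X$ we can find a $\delta(x) > 0$ such that for all $x' \in X$ with $d_X(x,x') < \delta(x)$ we have $d_Y(f(x),f(x')) < \tfrac12\varepsilon$. The collection

\[ \mathcal{U} = \{B(x;\tfrac12\delta(x)) : x \in X\} \]

is an open cover of $X$, and therefore has a finite subcover, say

\[ \mathcal{U}' = \{B(x_j;\tfrac12\delta(x_j)) : j = 1,\dots,n\}. \]

Let $\delta = \min\{\tfrac12\delta(x_j) : j = 1,\dots,n\}$.

Suppose now that $x,x' \in X$ satisfy $d_X(x,x') < \delta$. We have $x \in B(x_j;\tfrac12\delta(x_j))$ for some $1 \le j \le n$ (since $\mathcal{U}'$ covers $X$). Then $d_X(x,x_j) < \tfrac12\delta(x_j)$ and

\[ d_X(x',x_j) \le d_X(x',x) + d_X(x,x_j) < \delta + \tfrac12\delta(x_j) \le \delta(x_j). \]

Consequently, $d_Y(f(x),f(x')) \le d_Y(f(x),f(x_j)) + d_Y(f(x_j),f(x')) < \tfrac12\varepsilon + \tfrac12\varepsilon = \varepsilon$.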


In many applications, the following simple special case of Tychonov’s theorem (The- 
orem C.13) suffices. 


Proposition D.13. If $K_1,\dots,K_n$ are compact metric spaces, then their cartesian product $K := K_1 \times \dots \times K_n$ is a compact metric space with respect to the product metric


Proof Given $\varepsilon > 0$, for $j = 1,\dots,n$ choose finitely many open $d_j$-balls of radius $\varepsilon/n$ to cover $K_j$. Their cartesian products are open, contained in $d$-balls of radius $\varepsilon$, and cover $K$. Since $\varepsilon > 0$ was arbitrary, this shows that $K$ is totally bounded. Since the completeness of the metrics $d_j$ implies that $d$ is complete, this proves the compactness of $K$.


Definition D.14 (Separability). A metric space is called separable if it contains a dense 
countable subset. 


Proposition D.15. Every compact metric space is separable. 


Proof For each $n = 1,2,\dots$ cover the metric space with finitely many open balls of radius $\tfrac1n$, say $B_1^{(n)},\dots,B_{k_n}^{(n)}$. Together, the centres of all these balls form a countable dense subset. Indeed, any nonempty open set $U$ contains an open ball $B$, say of radius $r > 0$, and this ball must contain at least one of the balls $B_j^{(n)}$ for each $n \ge 1$ such that $\tfrac1n < \tfrac12 r$; for otherwise the sets $B_1^{(n)},\dots,B_{k_n}^{(n)}$ cannot cover $B$. The centres of such balls are in $U$.
Appendix E
Measure Spaces


This appendix reviews the basic elements of the theory of Lebesgue integration. 


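σ-Algebras
Let $\Omega$ be a set.


Definition E.1 (σ-Algebras). A σ-algebra in $\Omega$ is a collection $\mathcal{F}$ of subsets of $\Omega$ with the following properties:


(i) $\Omega \in \mathcal{F}$;
(ii) $F \in \mathcal{F}$ implies $\complement F \in \mathcal{F}$;
(iii) $F_1, F_2, \dots \in \mathcal{F}$ implies $\bigcup_{n\ge1} F_n \in \mathcal{F}$.


Here, $\complement F = \Omega \setminus F$ is the complement of $F$.

A measurable space is a pair $(\Omega,\mathcal{F})$, where $\Omega$ is a set and $\mathcal{F}$ is a σ-algebra in $\Omega$. The sets in $\mathcal{F}$ are often referred to as the measurable subsets of $\Omega$.

If $\mathcal{C}$ is any collection of subsets of $\Omega$, the σ-algebra generated by $\mathcal{C}$ is defined as the intersection of all σ-algebras containing $\mathcal{C}$ and is denoted by $\sigma(\mathcal{C})$. It is the smallest σ-algebra containing $\mathcal{C}$.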


The above three properties express that $\mathcal{F}$ is nonempty, closed under taking complements, and closed under taking countable unions. From

\[ \bigcap_{n\ge1} F_n = \complement\Bigl(\bigcup_{n\ge1} \complement F_n\Bigr) \]

it follows that $\mathcal{F}$ is closed under taking countable intersections. Clearly, $\mathcal{F}$ is closed under finite unions and intersections as well.
Example E.2. If $(X,\tau)$ is a topological space, we denote by $\mathcal{B}(X)$ the σ-algebra generated by $\tau$. This σ-algebra is referred to as the Borel σ-algebra of $(X,\tau)$. As an exercise the reader may check that

\[ (\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d)) = \prod_{j=1}^{d} (\mathbb{R}, \mathcal{B}(\mathbb{R})); \]

for a proof one may use that every open set in $\mathbb{R}^d$ is a countable union of rectangles of the form $[a_1,b_1) \times \dots \times [a_d,b_d)$.


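Let $(\Omega_1,\mathcal{F}_1), (\Omega_2,\mathcal{F}_2), \dots$ be a sequence of measurable spaces. The product

\[ \Bigl(\prod_{n\ge1} \Omega_n, \prod_{n\ge1} \mathcal{F}_n\Bigr) \]

is defined by $\prod_{n\ge1} \Omega_n := \Omega_1 \times \Omega_2 \times \cdots$ and $\prod_{n\ge1} \mathcal{F}_n = \mathcal{F}_1 \times \mathcal{F}_2 \times \cdots$ is defined as the product σ-algebra, that is, the σ-algebra generated by all sets of the form

\[ F_1 \times \dots \times F_N \times \Omega_{N+1} \times \Omega_{N+2} \times \cdots \]

with $N = 1,2,\dots$ and $F_n \in \mathcal{F}_n$ for $n = 1,\dots,N$. Finite products $\prod_{n=1}^{N} (\Omega_n,\mathcal{F}_n)$ are defined similarly; here one takes the σ-algebra in $\Omega_1 \times \dots \times \Omega_N$ generated by all sets of the form $F_1 \times \dots \times F_N$ with $F_n \in \mathcal{F}_n$ for $n = 1,\dots,N$.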

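Let $f : \Omega \to \mathbb{R}$ be any function. For Borel sets $B \in \mathcal{B}(\mathbb{R})$ we define

\[ \{f \in B\} := \{\omega \in \Omega : f(\omega) \in B\}. \]

The collection

\[ \sigma(f) = \bigl\{\{f \in B\} : B \in \mathcal{B}(\mathbb{R})\bigr\} \]

is a σ-algebra in $\Omega$, the σ-algebra generated by $f$.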


Measures 
Let $(\Omega,\mathcal{F})$ be a measurable space.


Definition E.3 (Measures). A measure on $(\Omega,\mathcal{F})$ is a mapping $\mu : \mathcal{F} \to [0,\infty]$ with the following properties:


(i) $\mu(\emptyset) = 0$;
(ii) for all disjoint sets $F_1, F_2, \dots$ in $\mathcal{F}$ we have $\mu\bigl(\bigcup_{n\ge1} F_n\bigr) = \sum_{n\ge1} \mu(F_n)$.


A triple $(\Omega,\mathcal{F},\mu)$, with $\mu$ a measure on a measurable space $(\Omega,\mathcal{F})$, is called a measure space.




A measure space $(\Omega,\mathcal{F},\mu)$ is called finite if $\mu$ is a finite measure, that is, if $\mu(\Omega) < \infty$. If $\mu(\Omega) = 1$, then $\mu$ is called a probability measure and $(\Omega,\mathcal{F},\mu)$ is called a probability space. In probability theory, it is customary to use the symbol $\mathbb{P}$ for a probability measure. A measure space $(\Omega,\mathcal{F},\mu)$ is called σ-finite if there exist $F_1, F_2, \dots$ in $\mathcal{F}$ such that $\bigcup_{n\ge1} F_n = \Omega$ and $\mu(F_n) < \infty$ for all $n \ge 1$. A Borel measure on a topological space $(X,\tau)$ is a measure $\mu : \mathcal{B}(X) \to [0,\infty]$, where $\mathcal{B}(X)$ is the Borel σ-algebra of $X$.

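The following properties of measures are easily checked:


(i) if $F_1 \subseteq F_2$ in $\mathcal{F}$, then $\mu(F_1) \le \mu(F_2)$;
(ii) if $F_1, F_2, \dots$ in $\mathcal{F}$, then
\[ \mu\Bigl(\bigcup_{n\ge1} F_n\Bigr) \le \sum_{n\ge1} \mu(F_n); \]
(iii) if $F_1 \subseteq F_2 \subseteq \dots$ in $\mathcal{F}$, then
\[ \mu\Bigl(\bigcup_{n\ge1} F_n\Bigr) = \lim_{n\to\infty} \mu(F_n); \]
(iv) if $F_1 \supseteq F_2 \supseteq \dots$ in $\mathcal{F}$ and $\mu(F_1) < \infty$, then
\[ \mu\Bigl(\bigcap_{n\ge1} F_n\Bigr) = \lim_{n\to\infty} \mu(F_n). \]


In (iii) and (iv), the limits (in $[0,\infty]$) exist by monotonicity.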


Dynkin’s Lemma 


Lemma E.4 (Dynkin's lemma). Let $\mu_1$ and $\mu_2$ be two finite measures defined on a measurable space $(\Omega,\mathcal{F})$. Let $\mathcal{A} \subseteq \mathcal{F}$ be a collection of sets with the following properties:


(i) $\Omega \in \mathcal{A}$;
(ii) $\mathcal{A}$ is closed under finite intersections;
(iii) the σ-algebra generated by $\mathcal{A}$ equals $\mathcal{F}$.

If $\mu_1(A) = \mu_2(A)$ for all $A \in \mathcal{A}$, then $\mu_1 = \mu_2$.


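Proof Let $\mathscr{D}$ denote the collection of all sets $D \in \mathscr{F}$ with $\mu_1(D) = \mu_2(D)$. Then $\mathscr{A} \subseteq \mathscr{D}$
and $\mathscr{D}$ is a Dynkin system, that is,

• $\Omega \in \mathscr{D}$;
• if $D_1 \subseteq D_2$ with $D_1, D_2 \in \mathscr{D}$, then also $D_2 \setminus D_1 \in \mathscr{D}$;
• if $D_1 \subseteq D_2 \subseteq \dots$ with all $D_n \in \mathscr{D}$, then also $\bigcup_{n\ge 1} D_n \in \mathscr{D}$.

By assumption we have $\mathscr{D} \subseteq \mathscr{F} = \sigma(\mathscr{A})$, the $\sigma$-algebra generated by $\mathscr{A}$; we will
show that $\sigma(\mathscr{A}) \subseteq \mathscr{D}$. To this end let $\mathscr{D}_0$ denote the smallest Dynkin system in $\mathscr{F}$
containing $\mathscr{A}$. We will show that $\sigma(\mathscr{A}) \subseteq \mathscr{D}_0$. In view of $\mathscr{D}_0 \subseteq \mathscr{D}$, this proves the
lemma.

Let $\mathscr{C} = \{D_0 \in \mathscr{D}_0 : D_0 \cap A \in \mathscr{D}_0 \text{ for all } A \in \mathscr{A}\}$. This is a Dynkin system and
$\mathscr{A} \subseteq \mathscr{C}$ since $\mathscr{A}$ is closed under taking finite intersections. It follows that $\mathscr{D}_0 \subseteq \mathscr{C}$,
since $\mathscr{D}_0$ is the smallest Dynkin system containing $\mathscr{A}$. But obviously, $\mathscr{C} \subseteq \mathscr{D}_0$, and
therefore $\mathscr{C} = \mathscr{D}_0$.

Now let $\mathscr{C}' = \{D_0 \in \mathscr{D}_0 : D_0 \cap D \in \mathscr{D}_0 \text{ for all } D \in \mathscr{D}_0\}$. This is a Dynkin system
and the fact that $\mathscr{C} = \mathscr{D}_0$ implies that $\mathscr{A} \subseteq \mathscr{C}'$. Hence $\mathscr{D}_0 \subseteq \mathscr{C}'$, since $\mathscr{D}_0$ is the smallest
Dynkin system containing $\mathscr{A}$. But obviously, $\mathscr{C}' \subseteq \mathscr{D}_0$, and therefore $\mathscr{C}' = \mathscr{D}_0$.

It follows that $\mathscr{D}_0$ is closed under taking finite intersections. But a Dynkin system
with this property is a $\sigma$-algebra. Thus, $\mathscr{D}_0$ is a $\sigma$-algebra, and now $\mathscr{A} \subseteq \mathscr{D}_0$ implies
that also $\sigma(\mathscr{A}) \subseteq \mathscr{D}_0$.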
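The role of the intersection hypothesis in Dynkin's lemma can be checked on a four-point set. The following sketch is an illustration under stated assumptions, not part of the proof: the set `OMEGA`, the two probability measures `MU1` and `MU2`, and the collection `BAD` are all hypothetical choices. The collection `BAD` contains the whole space and generates the full power set, but it is not closed under intersections, and the two measures agree on it without being equal.

```python
from fractions import Fraction
from itertools import chain, combinations

OMEGA = frozenset({1, 2, 3, 4})
# Two different probability measures, given by their point masses.
MU1 = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 4), 4: Fraction(1, 4)}
MU2 = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1, 2), 4: Fraction(0)}

# Contains OMEGA and generates the power set, but {1,2} & {2,3} = {2} is missing.
BAD = {OMEGA, frozenset({1, 2}), frozenset({2, 3})}


def measure(weights, subset):
    """Measure of a subset as the sum of its point masses."""
    return sum((weights[x] for x in subset), Fraction(0))


def powerset(s):
    """All subsets of the finite set s."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def agree_on(collection):
    """True if MU1 and MU2 assign the same value to every set in the collection."""
    return all(measure(MU1, a) == measure(MU2, a) for a in collection)


def closed_under_intersections(collection):
    """True if the collection contains all pairwise intersections of its members."""
    return all((a & b) in collection for a in collection for b in collection)
```

Adding the missing intersection $\{2\}$ to `BAD` repairs hypothesis (ii), and at that point the two measures no longer agree on the collection, so the lemma is not contradicted.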


Outer Measures 


Let $S$ be a set. The power set of $S$, that is, the set of all subsets of $S$, is denoted by
$2^S$. Let $\mu : 2^S \to [0,\infty]$ be a mapping which satisfies $\mu(\varnothing) = 0$. A set $A \subseteq S$ is called
$\mu$-measurable if
\[ \mu(Q) = \mu(Q \cap A) + \mu(Q \cap \complement A) \quad \text{for all } Q \in 2^S. \]
The collection of all $\mu$-measurable sets is denoted by $\mathscr{M}_\mu$.


Definition E.5 (Outer measures). A mapping $\nu : 2^S \to [0,\infty]$ is called an outer measure
if

(i) $\nu(\varnothing) = 0$;
(ii) $A \subseteq B$ implies $\nu(A) \le \nu(B)$;
(iii) for all $A_1, A_2, \dots \in 2^S$ we have
\[ \nu\Bigl(\bigcup_{n\ge 1} A_n\Bigr) \le \sum_{n\ge 1} \nu(A_n). \]
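Lemma E.6. Let $\mathscr{C} \subseteq 2^S$ satisfy $\varnothing \in \mathscr{C}$ and suppose that $\mu : \mathscr{C} \to [0,\infty]$ satisfies
$\mu(\varnothing) = 0$. For subsets $A \subseteq S$ define
\[ \mu^*(A) := \inf\Bigl\{ \sum_{j\ge 1} \mu(C_j) : A \subseteq \bigcup_{j\ge 1} C_j, \text{ where } C_j \in \mathscr{C} \text{ for all } j \ge 1 \Bigr\} \tag{E.1} \]
with the convention that $\mu^*(A) = \infty$ if the above set is empty. Then $\mu^*$ is an outer
measure.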
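Proof The mapping $\mu^* : 2^S \to [0,\infty]$ clearly satisfies the conditions (i) and (ii) in
Definition E.5. In order to check condition (iii) let $A_1, A_2, \dots$ be subsets of $S$ and let $\varepsilon > 0$
be arbitrary. If $\mu^*(A_n) = \infty$ for some $n \ge 1$, then (iii) trivially holds. We may therefore
assume that $\mu^*(A_n) < \infty$ for all $n \ge 1$. By the definition of $\mu^*$, for each fixed $n \ge 1$ we
can find $C_{n,j} \in \mathscr{C}$ such that
\[ A_n \subseteq \bigcup_{j\ge 1} C_{n,j} \quad\text{and}\quad \sum_{j\ge 1} \mu(C_{n,j}) \le \mu^*(A_n) + \frac{\varepsilon}{2^n}. \]
Then $\bigcup_{n\ge 1} A_n \subseteq \bigcup_{n,j\ge 1} C_{n,j}$, and, again by the definition of $\mu^*$,
\[ \mu^*\Bigl(\bigcup_{n\ge 1} A_n\Bigr) \le \sum_{n,j\ge 1} \mu(C_{n,j}) \le \sum_{n\ge 1} \Bigl(\mu^*(A_n) + \frac{\varepsilon}{2^n}\Bigr) = \sum_{n\ge 1} \mu^*(A_n) + \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, this proves the required estimate.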
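On a finite set the infimum in (E.1) can be computed by brute force, which gives a quick way to see the outer measure properties in action. The sketch below is an illustration only: the ground set `S`, the collection `C` and its values are hypothetical. Because `S` is finite and the values are nonnegative, it suffices to search over finite covers by distinct members of `C` (repeating a set can only increase the sum).

```python
from itertools import chain, combinations

S = frozenset({0, 1, 2, 3})
# Hypothetical set function on a small collection C that contains the empty set.
C = {
    frozenset(): 0.0,
    frozenset({0, 1}): 1.0,
    frozenset({1, 2}): 1.0,
    frozenset({2, 3}): 1.0,
    frozenset({0, 1, 2, 3}): 1.5,
}


def subsets(s):
    """All subsets of the finite set s."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def outer(a):
    """Cheapest cover of a by members of C; infinity if no cover exists."""
    best = float("inf")
    members = list(C)
    for r in range(1, len(members) + 1):
        for cover in combinations(members, r):
            if a <= frozenset().union(*cover):
                best = min(best, sum(C[c] for c in cover))
    return best
```

For instance, `outer(frozenset({0, 3}))` picks the single set of value 1.5 rather than the two-set cover of total value 2. Note that `outer` need not agree with the given values on `C` in general; agreement on a ring is the content of Carathéodory's theorem under countable additivity.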


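Theorem E.7 (Measures from outer measures). If $\nu : 2^S \to [0,\infty]$ is an outer measure,
then $\mathscr{M}_\nu$ is a $\sigma$-algebra and $\nu$ is a measure on $(S, \mathscr{M}_\nu)$.

For the proof of the theorem we need the following terminology. A ring in $S$ is a
subset $\mathscr{R}$ of $2^S$ with the following properties:

(i) $\varnothing \in \mathscr{R}$;
(ii) $A, B \in \mathscr{R}$ implies $A \setminus B \in \mathscr{R}$;
(iii) $A, B \in \mathscr{R}$ implies $A \cup B \in \mathscr{R}$.

If $\mathscr{R}$ is a ring, the identity $A \cap B = A \setminus (A \setminus B)$ implies that if $A, B \in \mathscr{R}$, then $A \cap B \in \mathscr{R}$.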


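Proof of Theorem E.7 We proceed in two steps.
Step 1: We begin by checking that if $\mu : 2^S \to [0,\infty]$ is any mapping which satisfies
$\mu(\varnothing) = 0$, then $\mathscr{M}_\mu$ is a ring and $\mu$ is additive on $\mathscr{M}_\mu$.
It is clear that $\varnothing \in \mathscr{M}_\mu$. In order to check that $\mathscr{M}_\mu$ is a ring we check the following:
(a) $A \in \mathscr{M}_\mu$ implies $\complement A \in \mathscr{M}_\mu$;
(b) $A, B \in \mathscr{M}_\mu$ implies $A \cap B \in \mathscr{M}_\mu$.
Given these properties it is straightforward to check that $\mathscr{M}_\mu$ is a ring. Indeed, this
follows from the formulas $B \setminus A = B \cap \complement A$ and $A \cup B = \complement(\complement A \cap \complement B)$.
Property (a) is clear. To check (b) let $A, B \in \mathscr{M}_\mu$ and set $C := A \cap B$. Let $Q \in 2^S$ be
arbitrary. Observing that $A \cap \complement B = \complement C \cap A$ and $\complement A = \complement C \cap \complement A$, and making repeated use
of the definition of $\mathscr{M}_\mu$, we have
\begin{align*}
\mu(Q) &= \mu(Q \cap A) + \mu(Q \cap \complement A) \\
&= \mu(Q \cap A \cap B) + \mu(Q \cap A \cap \complement B) + \mu(Q \cap \complement A) \\
&= \mu(Q \cap C) + \mu(Q \cap \complement C \cap A) + \mu(Q \cap \complement C \cap \complement A) \\
&= \mu(Q \cap C) + \mu(Q \cap \complement C).
\end{align*}
Therefore, $A \cap B = C \in \mathscr{M}_\mu$.
To check that $\mu$ is additive on $\mathscr{M}_\mu$ fix two disjoint sets $A, B \in \mathscr{M}_\mu$ and let $Q := A \cup B$.
Then $Q \cap A = A$ and $Q \cap \complement A = B$. Since $A \in \mathscr{M}_\mu$, we find
\[ \mu(A \cup B) = \mu(Q) = \mu(Q \cap A) + \mu(Q \cap \complement A) = \mu(A) + \mu(B). \]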

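Step 2: We now turn to the proof of the theorem. From Step 1 we know that $\mathscr{M}_\nu$ is
a ring and $\nu$ is additive on $\mathscr{M}_\nu$. In view of property (a) it remains to check that for any
disjoint sequence $(A_n)_{n\ge 1}$ in $\mathscr{M}_\nu$,
\[ A := \bigcup_{n\ge 1} A_n \in \mathscr{M}_\nu \quad\text{and}\quad \nu(A) = \sum_{n\ge 1} \nu(A_n). \tag{E.2} \]
Let $B_n = \bigcup_{j=1}^n A_j$ for each $n \ge 1$. Fix an arbitrary subset $Q$ of $S$. By Step 1, for all $n \ge 1$
we have $\complement A \subseteq \complement B_n$, $B_n \in \mathscr{M}_\nu$, and
\begin{align*}
\sum_{j=1}^n \nu(Q \cap A_j) + \nu(Q \cap \complement A) &= \nu(Q \cap B_n) + \nu(Q \cap \complement A) \\
&\le \nu(Q \cap B_n) + \nu(Q \cap \complement B_n) = \nu(Q).
\end{align*}
Using the $\sigma$-subadditivity of $\nu$ and then passing to the limit $n \to \infty$, we infer
\[ \nu(Q \cap A) + \nu(Q \cap \complement A) \le \sum_{j\ge 1} \nu(Q \cap A_j) + \nu(Q \cap \complement A) \le \nu(Q). \tag{E.3} \]
On the other hand, by subadditivity also the converse inequality $\nu(Q) \le \nu(Q \cap A) +
\nu(Q \cap \complement A)$ holds. This shows that $A \in \mathscr{M}_\nu$ and that the inequalities in (E.3) are in fact
equalities. Now (E.2) follows by taking $Q = A$ in (E.3).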
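For a finite ground set the collection of measurable sets in Theorem E.7 can be enumerated directly from the defining identity. The sketch below is an illustration with hypothetical choices: `counting` is the counting measure, for which every set is measurable, and `coarse` is an outer measure on a three-point set for which only the empty set and the whole space are measurable.

```python
from itertools import chain, combinations

S = frozenset({0, 1, 2})


def subsets(s):
    """All subsets of the finite set s."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def counting(a):
    """Counting measure, viewed as an outer measure on all subsets."""
    return len(a)


def coarse(a):
    """An outer measure: 0 on the empty set, 2 on S, 1 on everything else."""
    if not a:
        return 0
    return 2 if a == S else 1


def measurable_sets(nu):
    """Sets A with nu(Q) = nu(Q & A) + nu(Q - A) for every test set Q."""
    return {a for a in subsets(S)
            if all(nu(q) == nu(q & a) + nu(q - a) for q in subsets(S))}
```

In both cases the resulting collection is a $\sigma$-algebra on which the outer measure is additive, as the theorem predicts; the second case shows that this $\sigma$-algebra can be as small as possible.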


Carathéodory’s Extension Theorem 


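For additive functions $\mu : \mathscr{R} \to [0,\infty]$ one has the following result.

Lemma E.8. Let $\mathscr{R}$ be a ring and $\mu : \mathscr{R} \to [0,\infty]$ be additive, that is,
\[ \mu\Bigl(\bigcup_{j=1}^n A_j\Bigr) = \sum_{j=1}^n \mu(A_j) \]
holds for all disjoint sets $A_1, \dots, A_n \in \mathscr{R}$. The following assertions hold:

(1) if $A, B \in \mathscr{R}$ and $A \subseteq B$, then $\mu(A) \le \mu(B)$;
(2) if $A_1, A_2, \dots \in \mathscr{R}$ are disjoint and $\bigcup_{j\ge 1} A_j \in \mathscr{R}$, then
\[ \mu\Bigl(\bigcup_{j\ge 1} A_j\Bigr) \ge \sum_{j\ge 1} \mu(A_j); \]
(3) if $A_1, A_2, \dots \in \mathscr{R}$ and $\bigcup_{j\ge 1} A_j \in \mathscr{R}$ and $\mu$ is countably additive on $\mathscr{R}$, then
\[ \mu\Bigl(\bigcup_{j\ge 1} A_j\Bigr) \le \sum_{j\ge 1} \mu(A_j). \]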


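Proof (1): Writing $B = A \cup (B \setminus A)$, we see that
\[ \mu(B) = \mu(A \cup (B \setminus A)) = \mu(A) + \mu(B \setminus A) \ge \mu(A). \]
(2): Let $n \in \mathbb{N}$. Then $\bigcup_{j=1}^n A_j \subseteq \bigcup_{j\ge 1} A_j$ and therefore, by additivity of $\mu$ and (1),
\[ \sum_{j=1}^n \mu(A_j) = \mu\Bigl(\bigcup_{j=1}^n A_j\Bigr) \le \mu\Bigl(\bigcup_{j\ge 1} A_j\Bigr). \]
The result follows by letting $n \to \infty$.

(3): The sets $B_1 := A_1$, $B_2 := A_2 \setminus A_1$, $B_3 := A_3 \setminus (A_1 \cup A_2)$, ... are disjoint and we
have $\bigcup_{j\ge 1} B_j = \bigcup_{j\ge 1} A_j$. Therefore, by the $\sigma$-additivity of $\mu$ and (1),
\[ \mu\Bigl(\bigcup_{j\ge 1} A_j\Bigr) = \mu\Bigl(\bigcup_{j\ge 1} B_j\Bigr) = \sum_{j\ge 1} \mu(B_j) \le \sum_{j\ge 1} \mu(A_j). \]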


Theorem E.9 (Carathéodory's extension theorem). Let $\mathscr{R}$ be a ring in $S$ and suppose
that $\mu : \mathscr{R} \to [0,\infty]$ is countably additive on $\mathscr{R}$ and satisfies $\mu(\varnothing) = 0$. Then:

(1) the outer measure $\mu^*$ restricts to a measure on $\sigma(\mathscr{R})$ extending $\mu$;
(2) if $\mu^*$ is $\sigma$-finite on $\sigma(\mathscr{R})$ and if $\nu$ is another $\sigma$-finite measure on $\sigma(\mathscr{R})$ extending
$\mu$, then $\mu^* = \nu$.

From Lemma E.6 we know that the mapping $\mu^*$ defined by (E.1) is indeed an outer
measure.


Proof By Theorem E.7, $\mu^*$ is a measure on the $\sigma$-algebra $\mathscr{M}_{\mu^*}$. We prove that it has
the following properties:

(i) $\mathscr{R} \subseteq \mathscr{M}_{\mu^*}$;
(ii) $\mu^*(A) = \mu(A)$ for all $A \in \mathscr{R}$.

Clearly, part (1) of the theorem follows from these two properties, which actually show that there
is a further extension to the possibly larger $\sigma$-algebra $\mathscr{M}_{\mu^*}$.

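Step 1: In this step we prove (i). Let $A \in \mathscr{R}$ and $Q \subseteq S$ be given. The subadditivity of
$\mu^*$ gives $\mu^*(Q) \le \mu^*(Q \cap A) + \mu^*(Q \cap \complement A)$. The converse estimate $\mu^*(Q \cap A) + \mu^*(Q \cap
\complement A) \le \mu^*(Q)$ trivially holds if $\mu^*(Q) = \infty$. If $\mu^*(Q) < \infty$, choose $B_1, B_2, \dots \in \mathscr{R}$ such
that $Q \subseteq \bigcup_{n\ge 1} B_n$. Then $B_n \cap A, B_n \cap \complement A \in \mathscr{R}$ for all $n \ge 1$, and
\[ Q \cap A \subseteq \bigcup_{n\ge 1} B_n \cap A \quad\text{and}\quad Q \cap \complement A \subseteq \bigcup_{n\ge 1} B_n \cap \complement A. \]
Using first the definition of $\mu^*$ and then the additivity of $\mu$ on $\mathscr{R}$, we find
\[ \mu^*(Q \cap A) + \mu^*(Q \cap \complement A) \le \sum_{n\ge 1} \mu(B_n \cap A) + \sum_{n\ge 1} \mu(B_n \cap \complement A) = \sum_{n\ge 1} \mu(B_n). \]
Taking the infimum over all admissible sequences $B_1, B_2, \dots$ as specified above, we
obtain $\mu^*(Q \cap A) + \mu^*(Q \cap \complement A) \le \mu^*(Q)$. Combining both estimates, we conclude that
$A \in \mathscr{M}_{\mu^*}$.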

Step 2: In this step we prove (ii). Let $A \in \mathscr{R}$. It is clear that $\mu^*(A) \le \mu(A)$; this
follows by taking $B_1 = A$ and $B_n = \varnothing$ for $n \ge 2$ in (E.1). The converse estimate $\mu(A) \le
\mu^*(A)$ trivially holds if $\mu^*(A) = \infty$. If $\mu^*(A) < \infty$, choose $B_1, B_2, \dots \in \mathscr{R}$ such that
$A \subseteq \bigcup_{n\ge 1} B_n$. Then, by Lemma E.8(1) and (3) applied to $A = \bigcup_{n\ge 1} A \cap B_n$ (part (3) can
be applied by the assumed countable additivity of $\mu$ on $\mathscr{R}$),
\[ \mu(A) \le \sum_{n\ge 1} \mu(A \cap B_n) \le \sum_{n\ge 1} \mu(B_n). \]
Taking the infimum over all admissible sequences $B_1, B_2, \dots$ as specified above, we
obtain $\mu(A) \le \mu^*(A)$.

Let now the assumptions of part (2) be satisfied and choose pairwise disjoint sets
$S_n \in \sigma(\mathscr{R})$ such that $S = \bigcup_{n\ge 1} S_n$ and $\mu^*(S_n) < \infty$ and $\nu(S_n) < \infty$. Then the restrictions
of $\mu^*$ and $\nu$ agree on the $\sigma$-algebras $\{F \cap S_n : F \in \sigma(\mathscr{R})\}$ in $S_n$ by Dynkin's lemma
(which can be applied, noting that the collections $\mathscr{R}_n := \{R \cap S_n : R \in \mathscr{R}\}$ are rings in
$S_n$ and hence are closed under finite intersections). By countable additivity, this in turn
implies that $\mu^*$ and $\nu$ agree on $\sigma(\mathscr{R})$.


To verify the countable additivity condition in Carathéodory’s result one may use the 
following sufficient condition. 


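Proposition E.10. Let $\mathscr{R}$ be a ring in a set $S$ and let $\mu : \mathscr{R} \to [0,\infty]$ be an additive map
with the property that $\mu(\varnothing) = 0$. If for each nonincreasing sequence $(A_n)_{n\ge 1}$ in $\mathscr{R}$ with
$\bigcap_{n\ge 1} A_n = \varnothing$ we have $\lim_{n\to\infty} \mu(A_n) = 0$, then $\mu$ is countably additive on $\mathscr{R}$.

Proof Let $(B_j)_{j\ge 1}$ be a disjoint sequence in $\mathscr{R}$ with $B := \bigcup_{j\ge 1} B_j \in \mathscr{R}$. We need to
show that
\[ \mu(B) = \sum_{j\ge 1} \mu(B_j). \tag{E.4} \]
Let $A_n = \bigcup_{j\ge n} B_j = B \setminus (B_1 \cup \dots \cup B_{n-1})$. Then $A_n \in \mathscr{R}$ and $A_n \downarrow \varnothing$, and therefore
$\mu(A_n) \to 0$ by assumption. On the other hand,
\[ \mu(B) = \mu(A_n \cup B_1 \cup B_2 \cup \dots \cup B_{n-1}) = \mu(A_n) + \sum_{j=1}^{n-1} \mu(B_j). \]
Therefore, $0 \le \mu(B) - \sum_{j=1}^{n-1} \mu(B_j) = \mu(A_n) \to 0$ and (E.4) follows.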
Lebesgue Measure

As a first application of Carathéodory's theorem we construct the Lebesgue measure.
For $a = (a_1, \dots, a_d)$ and $b = (b_1, \dots, b_d)$ such that $a_j < b_j$ for $j = 1, \dots, d$ we write
\[ (a, b] = \{x \in \mathbb{R}^d : a_j < x_j \le b_j, \ j = 1, \dots, d\}. \]
The collection $\mathscr{H}^d$ of all finite unions of half-open rectangles is a ring and every set in
$\mathscr{H}^d$ can be written as a finite union of disjoint half-open rectangles.
For $I = (a, b]$ let
\[ |I| := \prod_{j=1}^d (b_j - a_j). \]
For $A \in \mathscr{H}^d$ of the form $A = I_1 \cup \dots \cup I_n$ with disjoint half-open rectangles $I_j$ define $\lambda_d : \mathscr{H}^d \to [0,\infty]$
by
\[ \lambda_d(A) := \sum_{j=1}^n |I_j|. \]


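We must check that this number is well defined. To this end suppose that $A = (a_1, b_1] \cup
\dots \cup (a_m, b_m] = (c_1, d_1] \cup \dots \cup (c_n, d_n]$ are two representations of $A$ as unions of disjoint
half-open rectangles. Then $I_{ij} = (a_i, b_i] \cap (c_j, d_j]$ is either empty or a nonempty half-open
rectangle, and we have
\[ \bigcup_{i=1}^m I_{ij} = (c_j, d_j] \quad\text{and}\quad \bigcup_{j=1}^n I_{ij} = (a_i, b_i]. \]
From the definition and the disjointness of the sets $I_{ij}$ we obtain
\begin{align*}
\sum_{i=1}^m \lambda_d((a_i, b_i]) &= \sum_{i=1}^m \lambda_d\Bigl(\bigcup_{j=1}^n I_{ij}\Bigr) = \sum_{i=1}^m \sum_{j=1}^n \lambda_d(I_{ij}) \\
&= \sum_{j=1}^n \sum_{i=1}^m \lambda_d(I_{ij}) = \sum_{j=1}^n \lambda_d\Bigl(\bigcup_{i=1}^m I_{ij}\Bigr) = \sum_{j=1}^n \lambda_d((c_j, d_j]),
\end{align*}
which proves the asserted well-definedness.
When the dimension $d$ is fixed and there is no danger of confusion we write $\lambda$ for $\lambda_d$.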
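The well-definedness argument via the common refinement $I_{ij}$ can be traced in dimension one. The following sketch makes several simplifying assumptions: $d = 1$, a half-open interval $(a, b]$ is encoded as the pair `(a, b)`, and the two decompositions used in the check are hypothetical examples of the same set $(0, 3]$.

```python
def length(interval):
    """Length |I| of the half-open interval (a, b], encoded as the pair (a, b)."""
    a, b = interval
    return b - a


def lam(disjoint_intervals):
    """Value assigned to a disjoint union of half-open intervals."""
    return sum(length(i) for i in disjoint_intervals)


def intersect(i, j):
    """Intersection of two half-open intervals, or None if it is empty."""
    a, b = max(i[0], j[0]), min(i[1], j[1])
    return (a, b) if a < b else None


def refinement_sum(first, second):
    """Sum of the lengths of all nonempty pieces I_ij of the common refinement."""
    pieces = (intersect(i, j) for i in first for j in second)
    return sum(length(p) for p in pieces if p is not None)
```

Both decompositions give the same total as the common refinement, which is exactly the double-sum identity used in the proof.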


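Lemma E.11. The function $\lambda : \mathscr{H}^d \to [0,\infty]$ is countably additive on $\mathscr{H}^d$.

Proof By Proposition E.10 it suffices to prove that for each sequence $(A_n)_{n\ge 1}$ in $\mathscr{H}^d$
satisfying $A_n \downarrow \varnothing$ we have $\lambda(A_n) \to 0$. Fix such a sequence $(A_n)_{n\ge 1}$ and let $\varepsilon > 0$. We
have to find $N \in \mathbb{N}$ such that $\lambda(A_n) < \varepsilon$ for all $n \ge N$.

Step 1: For each $n \in \mathbb{N}$ choose a $B_n \in \mathscr{H}^d$ such that $\overline{B_n} \subseteq A_n$ and $\lambda(A_n \setminus B_n) < 2^{-n}\varepsilon$.
Since $\overline{B_n} \subseteq A_n$, we also have $\bigcap_{n\ge 1} \overline{B_n} = \varnothing$. It follows that the complements of the sets $\overline{B_n}$
form an open cover of the set $\overline{A_1}$, which is compact by the Bolzano–Weierstrass theorem.
Therefore, there exists an $N$ such that $\overline{A_1} \subseteq \bigcup_{n=1}^N \complement \overline{B_n}$. It follows that $\bigcap_{n=1}^N \overline{B_n} \subseteq \complement \overline{A_1}$.
Since $\overline{B_n} \subseteq \overline{A_1}$ for all $n \ge 1$, we must have that $\bigcap_{n=1}^N \overline{B_n} = \varnothing$.

Step 2: Let $C_n = \bigcap_{j=1}^n B_j$ for $n \ge 1$. For every $n \ge 1$, $A_n \setminus C_n = \bigcup_{j=1}^n (A_n \setminus B_j) \subseteq
\bigcup_{j=1}^n (A_j \setminus B_j)$. Therefore, using Lemma E.8(1) in $(*)$ and the subadditivity of $\lambda$ in
$(**)$, the verification of which we leave as an easy exercise, we find
\[ \lambda(A_n \setminus C_n) \overset{(*)}{\le} \lambda\Bigl(\bigcup_{j=1}^n (A_j \setminus B_j)\Bigr) \overset{(**)}{\le} \sum_{j=1}^n \lambda(A_j \setminus B_j) < \sum_{j=1}^n 2^{-j}\varepsilon < \varepsilon. \]
Since $C_n = \varnothing$ for all $n \ge N$, we can conclude that $\lambda(A_n) = \lambda(A_n \setminus C_n) < \varepsilon$ for every
$n \ge N$.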


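Theorem E.12 (Lebesgue measure). There exists a unique Borel measure $\lambda$ on $\mathbb{R}^d$
satisfying
\[ \lambda(I) = |I| \]
for all half-open rectangles $I$. Moreover, for all $h \in \mathbb{R}^d$ and $A \in \mathscr{B}(\mathbb{R}^d)$ we have
\[ \lambda(A + h) = \lambda(A), \]
where $A + h := \{x + h : x \in A\}$.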


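Proof In Lemma E.11 we have shown that $\lambda$ is countably additive on the ring $\mathscr{H}^d$.
Therefore, by Theorem E.9, $\lambda$ admits a unique extension to a $\sigma$-finite measure on
$\sigma(\mathscr{H}^d) = \mathscr{B}(\mathbb{R}^d)$.

To prove translation invariance let $h \in \mathbb{R}^d$. We claim that for every $A \in \mathscr{B}(\mathbb{R}^d)$ one
has $A + h \in \mathscr{B}(\mathbb{R}^d)$. For this let $\mathscr{A}_h = \{A \in \mathscr{B}(\mathbb{R}^d) : A + h \in \mathscr{B}(\mathbb{R}^d)\}$. By definition
$\mathscr{A}_h \subseteq \mathscr{B}(\mathbb{R}^d)$. One can check that $\mathscr{A}_h$ is a $\sigma$-algebra. For each open set $A$, the set $A + h$
is open and hence $A + h \in \mathscr{B}(\mathbb{R}^d)$. Therefore, $\mathscr{B}(\mathbb{R}^d) = \sigma(\{\text{open sets}\}) \subseteq \mathscr{A}_h$, and the
claim follows.

For $A \in \mathscr{B}(\mathbb{R}^d)$ set $\mu_h(A) := \lambda(A + h)$. Then $\mu_h$ is a measure on $\mathscr{B}(\mathbb{R}^d)$ and for any
half-open rectangle $I$, $\mu_h(I) = |I + h| = |I| = \lambda(I)$. By uniqueness, we find $\mu_h(A) = \lambda(A)$
for all $A \in \mathscr{B}(\mathbb{R}^d)$.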


Product Measures 


As a second application of Carathéodory’s theorem we prove the existence of product 
measures. 


Theorem E.13 (Product measures). Let $(\Omega_j, \mathscr{F}_j, \mu_j)$, $j = 1, \dots, n$, be $\sigma$-finite measure
spaces. Then there exists a unique $\sigma$-finite measure $\mu = \prod_{j=1}^n \mu_j$ on the product $\sigma$-algebra
$\mathscr{F} = \prod_{j=1}^n \mathscr{F}_j$ which satisfies
\[ \mu(F_1 \times \dots \times F_n) = \prod_{j=1}^n \mu_j(F_j) \]
whenever the sets $F_j \in \mathscr{F}_j$ satisfy $\mu_j(F_j) < \infty$ for $j = 1, \dots, n$.
The measure $\mu$ is called the product of $\mu_1, \dots, \mu_n$.

Proof Let $\mathscr{R}$ be the ring consisting of all finite unions of measurable rectangles of
finite measure, that is, sets of the form $\prod_{j=1}^n F_j$ with $F_j \in \mathscr{F}_j$ satisfying $\mu_j(F_j) < \infty$
for $j = 1, \dots, n$. Since the intersection of finitely many measurable rectangles of finite
measure is a measurable rectangle of finite measure, every $R \in \mathscr{R}$ can be written as a
finite union of disjoint measurable rectangles of finite measure, say $R = R^{(1)} \cup \dots \cup R^{(k)}$,
and we may define
\[ \mu(R) := \sum_{j=1}^k \mu(R^{(j)}), \]
where each $\mu(R^{(j)})$ is given by the product formula in the statement of the theorem.
The proof that $\mu(R)$ is well defined follows the lines of the proof for the Lebesgue
measure. It is clear that $\mu$ is additive on $\mathscr{R}$. We claim that $\mu$ is countably additive on
$\mathscr{R}$. Once we know this, the existence of a unique $\sigma$-finite product measure follows from
Carathéodory's theorem.

A quick proof of the claim is obtained by applying Proposition E.10 in combination
with the dominated convergence theorem. The reader may check that no circularity is
introduced by borrowing this result at this stage. Thus let $(A_j)_{j\ge 1}$ be a nonincreasing
sequence of sets in $\mathscr{R}$ satisfying $\bigcap_{j\ge 1} A_j = \varnothing$. We must show that $\lim_{j\to\infty} \mu(A_j) = 0$.
We have
\[ \mu(A_j) = \int_{\Omega_n} \cdots \int_{\Omega_1} 1_{A_j} \, d\mu_1 \cdots d\mu_n, \]
using that $A_j$ is the finite union of measurable rectangles of finite measure and that
the identity holds for such sets by definition. The asserted convergence now follows by
applying dominated convergence $n$ times consecutively.


Example E.14. The Lebesgue measure on $(\mathbb{R}^d, \mathscr{B}(\mathbb{R}^d))$ is the product measure of $d$
copies of the Lebesgue measure on $(\mathbb{R}, \mathscr{B}(\mathbb{R}))$.


Borel Measures on Metric Spaces 


Definition E.15 (Regularity). A Borel measure $\mu$ on a topological space $X$ is called:

• inner regular, if for all Borel subsets $B$ of $X$ and all $\varepsilon > 0$ there is a closed set $F$ in $X$
such that $F \subseteq B$ and $\mu(B \setminus F) < \varepsilon$;
• outer regular, if for every Borel subset $B$ of $X$ and all $\varepsilon > 0$ there is an open set $U$ in
$X$ such that $B \subseteq U$ and $\mu(U \setminus B) < \varepsilon$;
• regular, if it is inner regular and outer regular.

Proposition E.16 (Regularity). Every finite Borel measure on a metric space is regular.


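Proof Let $\mu$ be a finite Borel measure on a metric space $X$. Denote by $\mathscr{A}(X)$ the
collection of all Borel sets $A$ in $X$ which have the property that for all $\varepsilon > 0$ there exist a
closed set $F$ and an open set $U$ such that $F \subseteq A \subseteq U$ and $\mu(U \setminus F) < \varepsilon$. We must prove
that $\mathscr{A}(X) = \mathscr{B}(X)$.

Claim 1: $\mathscr{A}(X)$ is a $\sigma$-algebra. It is clear that $\varnothing \in \mathscr{A}(X)$ and that $\mathscr{A}(X)$ is closed
under taking complements. To see that $\mathscr{A}(X)$ is closed under taking countable unions,
let $(A_n)_{n\ge 1}$ be sets in $\mathscr{A}(X)$. Let $\varepsilon > 0$ be given and let $(F_n)_{n\ge 1}$ and $(U_n)_{n\ge 1}$ be
sequences of closed and open sets such that $F_n \subseteq A_n \subseteq U_n$ and $\mu(U_n \setminus F_n) < \varepsilon/2^n$. The set
$U = \bigcup_{n\ge 1} U_n$ is open. In view of
\[ \mu\Bigl(U \setminus \bigcup_{n\ge 1} F_n\Bigr) \le \sum_{n\ge 1} \mu(U_n \setminus F_n) < \varepsilon \]
there exists an index $N$ such that
\[ \mu\Bigl(U \setminus \bigcup_{n=1}^N F_n\Bigr) < \varepsilon. \]
The set $F := \bigcup_{n=1}^N F_n$ is closed, satisfies $F \subseteq \bigcup_{n\ge 1} A_n \subseteq U$, and $\mu(U \setminus F) < \varepsilon$.

Claim 2: $\mathscr{A}(X)$ contains all closed subsets of $X$. To see this, let $F$ be a closed subset
of $X$ and define, for $k \ge 1$, $U_k := \{x \in X : d(x, F) < \frac{1}{k}\}$. Then each $U_k$ is open and we
have $\bigcap_{k\ge 1} U_k = F$. Hence, $\lim_{k\to\infty} \mu(U_k) = \mu(F)$ and the claim follows.

Combining the two claims we see that $\mathscr{A}(X) = \mathscr{B}(X)$.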


In order to state the theorem we need the following terminology. 


Definition E.17 (Tightness). A Borel measure $\mu$ on a topological space $X$ is called tight
if for every $\varepsilon > 0$ there exists a compact set $K$ in $X$ such that $\mu(X \setminus K) < \varepsilon$.


The following proposition gives a sufficient condition for tightness. 


Proposition E.18. Every Borel probability measure $\mu$ on a separable complete metric
space $X$ is tight.

Proof Let $(x_n)_{n\ge 1}$ be a dense sequence in $X$ and fix $\varepsilon > 0$. For each integer $k \ge 1$ the
closed balls $\overline{B}(x_n; \frac{1}{k})$, $n \ge 1$, cover $X$, and therefore there exists an index $N_k \ge 1$ such that
\[ \mu\Bigl(\bigcup_{n=1}^{N_k} \overline{B}(x_n; \tfrac{1}{k})\Bigr) \ge 1 - \frac{\varepsilon}{2^k}. \]
The set
\[ K := \bigcap_{k\ge 1} \bigcup_{n=1}^{N_k} \overline{B}(x_n; \tfrac{1}{k}) \]
is closed and totally bounded. Since $X$ is assumed to be complete, $K$ is compact by
Theorem D.10. Moreover,
\[ \mu(\complement K) \le \sum_{k\ge 1} \frac{\varepsilon}{2^k} = \varepsilon. \]


For separable complete metric spaces, this result implies the following improvement
to the inner regularity of Borel measures provided by Proposition E.16.

Corollary E.19. Let $\mu$ be a Borel probability measure on a separable complete metric
space $E$. Then for all Borel subsets $B$ of $E$ and all $\varepsilon > 0$ there is a compact set $K$ in $E$
such that $K \subseteq B$ and $\mu(B \setminus K) < \varepsilon$.

Definition E.20. A Borel measure $\mu$ on a topological space $X$ is called Radon, if for
every Borel subset $B$ of $X$ and all $\varepsilon > 0$ there is a compact set $K \subseteq X$ and an open set
$U \subseteq X$ such that $K \subseteq B \subseteq U$ and $\mu(U \setminus K) < \varepsilon$.

Proposition E.21. Let $\mu$ be a finite Borel measure on a separable complete metric
space $X$. Then $\mu$ is Radon.


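Proof The measure $\mu$ is regular by Proposition E.16, so for every Borel set $B$ and every $\varepsilon > 0$ there are
a closed set $F \subseteq X$ and an open set $U \subseteq X$ such that $F \subseteq B \subseteq U$ and $\mu(U \setminus F) < \frac{1}{2}\varepsilon$. In
particular we have $\mu(B \setminus F) < \frac{1}{2}\varepsilon$. By Proposition E.18 (applied to the normalised measure
$\mu/\mu(X)$ if $\mu(X) > 0$), $\mu$ is tight, so there is a compact
set $K$ such that $\mu(X \setminus K) < \frac{1}{2}\varepsilon$. The compact set $F \cap K$ is contained in $B$ and satisfies
$\mu(B \setminus (F \cap K)) < \varepsilon$. Since $U \setminus (F \cap K) \subseteq (U \setminus F) \cup (X \setminus K)$, we also have
$\mu(U \setminus (F \cap K)) < \varepsilon$, as required in Definition E.20.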


Appendix F 


Integration 


In this appendix we review the Lebesgue integral. 


Measurable Functions 


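Let $(\Omega_1, \mathscr{F}_1)$ and $(\Omega_2, \mathscr{F}_2)$ be measurable spaces. A function $f : \Omega_1 \to \Omega_2$ is said to
be measurable if $\{f \in F\} \in \mathscr{F}_1$ for all $F \in \mathscr{F}_2$. Clearly, compositions of measurable
functions are measurable. If $\mathscr{C}_2$ is a subset of $\mathscr{F}_2$ with the property that $\sigma(\mathscr{C}_2) = \mathscr{F}_2$,
then a function $f : \Omega_1 \to \Omega_2$ is measurable if and only if
\[ \{f \in C\} \in \mathscr{F}_1 \quad\text{for all } C \in \mathscr{C}_2. \]
Indeed, just notice that $\{F \in \mathscr{F}_2 : \{f \in F\} \in \mathscr{F}_1\}$ is a sub-$\sigma$-algebra of $\mathscr{F}_2$ containing
$\mathscr{C}_2$.

Let $(\Omega_1, \mathscr{F}_1)$ and $(\Omega_2, \mathscr{F}_2)$ be measurable spaces and let $\mu$ be a measure on $(\Omega_1, \mathscr{F}_1)$. If
$f : \Omega_1 \to \Omega_2$ is measurable, then
\[ f(\mu)(F) := \mu\{f \in F\}, \quad F \in \mathscr{F}_2, \]
defines a measure $f(\mu)$ on $(\Omega_2, \mathscr{F}_2)$. This measure is called the image of $\mu$ under $f$. In
the context where $(\Omega_1, \mathscr{F}_1, \mathbb{P})$ is a probability space, the image of $\mathbb{P}$ under $f$ is called the
distribution of $f$.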

In most applications we are concerned with functions $f$ from a measurable space
$(\Omega, \mathscr{F})$ to $(\mathbb{K}, \mathscr{B}(\mathbb{K}))$. Such functions are said to be Borel measurable. In what follows
we summarise some measurability properties of Borel measurable functions (the
adjective 'Borel' is omitted when no confusion is likely to arise).

Since open sets are Borel, continuous functions are Borel measurable. 

By the observation made earlier, a function $f : \Omega \to \mathbb{R}$ is Borel measurable if and
only if $\{f > a\} \in \mathscr{F}$ for all $a \in \mathbb{R}$, and a function $f : \Omega \to \mathbb{C}$ is Borel measurable
if and only if $\{\operatorname{Re} f > a, \operatorname{Im} f > b\} \in \mathscr{F}$ for all $a, b \in \mathbb{R}$. From this it follows that
linear combinations, products, and quotients (if defined) of measurable functions are
measurable. For example, if the real-valued functions $f$ and $g$ are measurable, then
$f + g$ is measurable since
\[ \{f + g > a\} = \bigcup_{q \in \mathbb{Q}} \{f > q\} \cap \{g > a - q\}. \]
The measurability of the sum of two complex-valued measurable functions is proved in
the same way.

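If $f : \Omega \to \mathbb{K}$ and $g : \Omega \to \mathbb{K}$ are measurable, then $fg = \frac{1}{2}[(f + g)^2 - (f^2 + g^2)]$ is
measurable.

If $f : \Omega \to \mathbb{C}$ is measurable, then its complex conjugate $\overline{f}$ is measurable (it is the
composition of the measurable function $f$ and the continuous function $z \mapsto \overline{z}$), and therefore
$\operatorname{Re} f = \frac{1}{2}(f + \overline{f})$ and $\operatorname{Im} f = \frac{1}{2i}(f - \overline{f})$ are measurable. Conversely, if $\operatorname{Re} f$ and $\operatorname{Im} f$ are
measurable, so is $f = \operatorname{Re} f + i \operatorname{Im} f$.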

If $f = \sup_{n\ge 1} f_n$ pointwise and each $f_n : \Omega \to \mathbb{R}$ is measurable, then $f$ is measurable
since
\[ \{f > a\} = \bigcup_{n\ge 1} \{f_n > a\}. \]
It follows from $\inf_{n\ge 1} f_n = -\sup_{n\ge 1}(-f_n)$ that the pointwise infimum of a sequence of
measurable functions is measurable as well. From this we get that the pointwise limit
superior and limit inferior of measurable functions are measurable, since
\[ \limsup_{n\to\infty} f_n = \lim_{n\to\infty} \Bigl(\sup_{k\ge n} f_k\Bigr) = \inf_{n\ge 1} \Bigl(\sup_{k\ge n} f_k\Bigr) \]
and $\liminf_{n\to\infty} f_n = -\limsup_{n\to\infty}(-f_n)$. It follows that the pointwise limit $\lim_{n\to\infty} f_n$ of
a sequence of measurable functions $f_n : \Omega \to \mathbb{R}$ is measurable. By considering real and
imaginary parts separately, the latter extends to pointwise limits of functions $f_n : \Omega \to \mathbb{C}$.

In the above considerations involving suprema and infima, it is implicitly assumed
that these suprema and infima exist and are finite pointwise. This restriction can be
lifted by considering functions $f : \Omega \to [-\infty, \infty]$. Such functions are said to be Borel
measurable if the sets $\{f \in B\}$, $B \in \mathscr{B}(\mathbb{R})$, as well as the sets $\{f = \infty\}$ and $\{f = -\infty\}$
are in $\mathscr{F}$.

A simple function is a function $f : \Omega \to \mathbb{K}$ that can be represented in the form
\[ f = \sum_{n=1}^N c_n 1_{F_n} \]
with coefficients $c_n \in \mathbb{K}$ and disjoint sets $F_n \in \mathscr{F}$ for all $n = 1, \dots, N$.


Proposition F.1. A function $f : \Omega \to \mathbb{K}$ is measurable if and only if it is the pointwise
limit of a sequence of simple functions $f_n : \Omega \to \mathbb{K}$, which may be chosen to satisfy $0 \le
|f_n| \uparrow |f|$ pointwise. If $f$ is bounded, it may in addition be arranged that the convergence
is uniform. If $f$ is nonnegative, we may furthermore arrange that $0 \le f_n \uparrow f$ pointwise,
and uniformly if $f$ is bounded.

Proof There is no loss of generality in taking $\mathbb{K} = \mathbb{C}$.
The 'if' part is clear from the fact that measurability is preserved under taking pointwise
limits. It remains to prove the 'only if' part.

To prove the first assertion, for $j, k \in \mathbb{Z}$ and $n \in \mathbb{N}$ consider the rectangles
\[ R^{(n)}_{jk} = \Bigl[\frac{j}{2^n}, \frac{j+1}{2^n}\Bigr) + i \Bigl[\frac{k}{2^n}, \frac{k+1}{2^n}\Bigr) \]
in the complex plane. Let $c^{(n)}_{jk}$ be the unique point in the closure
of $R^{(n)}_{jk}$ with minimum modulus. Then the simple functions
\[ f_n = \sum_{j,k=-2^{2n}}^{2^{2n}} c^{(n)}_{jk} 1_{\{f \in R^{(n)}_{jk}\}} \]
satisfy $f_n \to f$ and $0 \le |f_n| \uparrow |f|$ pointwise. If $f$ is bounded, the convergence is uniform.
To prove the second assertion, for $j \in \mathbb{N}$ and $n \in \mathbb{N}$ consider the intervals $I^{(n)}_j =
\bigl[\frac{j}{2^n}, \frac{j+1}{2^n}\bigr)$ in the nonnegative real line. Then the simple functions
\[ f_n = \sum_{j=0}^{2^{2n}} \frac{j}{2^n} 1_{\{f \in I^{(n)}_j\}} \]
satisfy $0 \le f_n \uparrow f$ pointwise. If $f$ is bounded, the convergence is uniform.
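The dyadic construction for nonnegative functions can be tried out pointwise. The sketch below evaluates the approximating simple function at a single point where $f$ takes a given value; the function name `dyadic_approx` and the exact truncation index (cells $j \le 2^{2n}$) are assumptions made for this illustration, and any truncation level that grows with $n$ behaves in the same way.

```python
import math


def dyadic_approx(value, n):
    """Value of the n-th dyadic simple function at a point where f equals `value`.

    The value is rounded down to the grid of mesh 2**-n; beyond the last
    cell used at level n the simple function is zero.
    """
    j = math.floor(value * 2 ** n)      # index of the dyadic cell containing value
    return j / 2 ** n if j <= 2 ** (2 * n) else 0.0
```

For a fixed value the sequence `dyadic_approx(value, n)` is nondecreasing in `n`, never exceeds `value`, and is within $2^{-n}$ of it as soon as $2^n$ exceeds `value`. If $f$ is bounded, this last threshold can be chosen independently of the point, which is the source of the uniform convergence.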


The Lebesgue Integral 


We now construct the Lebesgue integral and study its properties. The construction of
the integral proceeds in two stages: in the first step, the Lebesgue integral of an arbitrary
nonnegative measurable function is defined (allowing the value $\infty$); in the second
step, the notion of integrability is introduced and the Lebesgue integral of an integrable
function is defined.

Let $(\Omega, \mathscr{F}, \mu)$ be a measure space. For a nonnegative simple function $f = \sum_{n=1}^N c_n 1_{F_n}$
we define
\[ \int_\Omega f \, d\mu := \sum_{n=1}^N c_n \mu(F_n). \]
We allow $\mu(F_n)$ to be infinite; this causes no problems because the coefficients $c_n$ are
nonnegative (we use the convention $0 \cdot \infty = 0$). It is easy to check that this integral is
well defined, in the sense that it does not depend on the particular representation of $f$ as
a simple function. Also, the integral is linear with respect to addition and multiplication
with nonnegative scalars,
\[ \int_\Omega af + bg \, d\mu = a \int_\Omega f \, d\mu + b \int_\Omega g \, d\mu, \]
and monotone in the sense that if $0 \le f \le g$ pointwise, then
\[ \int_\Omega f \, d\mu \le \int_\Omega g \, d\mu. \]
In what follows, a nonnegative function is a function with values in $[0, \infty]$. Recall
that such a function is said to be Borel measurable if the sets $\{f \in B\}$, $B \in \mathscr{B}(\mathbb{R})$, as
well as the set $\{f = \infty\}$ are in $\mathscr{F}$. In what follows, 'measurable' always means 'Borel
measurable'.

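For a nonnegative measurable function $f$ we choose a sequence of simple functions
$0 \le f_n \uparrow f$ (see Proposition F.1) and define
\[ \int_\Omega f \, d\mu := \lim_{n\to\infty} \int_\Omega f_n \, d\mu. \]
The following lemma implies that this definition does not depend on the approximating
sequence.

Lemma F.2. For a nonnegative measurable function $f$ and nonnegative simple functions
$f_n$ and $g$ such that $0 \le f_n \uparrow f$ and $g \le f$ pointwise we have
\[ \int_\Omega g \, d\mu \le \lim_{n\to\infty} \int_\Omega f_n \, d\mu. \]

Proof First consider the case $g = 1_F$. Fix $\varepsilon > 0$ arbitrary and let $F_n := \{1_F f_n \ge 1 - \varepsilon\}$.
Then $F_1 \subseteq F_2 \subseteq \dots$ and $\bigcup_{n\ge 1} F_n = F$, and therefore $\mu(F_n) \uparrow \mu(F)$. Since $1_F f_n \ge (1 -
\varepsilon) 1_{F_n}$,
\begin{align*}
\lim_{n\to\infty} \int_\Omega f_n \, d\mu &\ge \lim_{n\to\infty} \int_\Omega 1_F f_n \, d\mu \ge (1 - \varepsilon) \lim_{n\to\infty} \mu(F_n) \\
&= (1 - \varepsilon) \mu(F) = (1 - \varepsilon) \int_\Omega g \, d\mu.
\end{align*}
Since $\varepsilon > 0$ was arbitrary, this proves the lemma for $g = 1_F$. The general case follows by linearity.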


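The integral is linear and monotone on the set of nonnegative measurable functions.
Indeed, if $f$ and $g$ are such functions and $0 \le f_n \uparrow f$ and $0 \le g_n \uparrow g$, then for $a, b \ge 0$ we
have $0 \le af_n + bg_n \uparrow af + bg$ and therefore
\begin{align*}
\int_\Omega af + bg \, d\mu &= \lim_{n\to\infty} \int_\Omega af_n + bg_n \, d\mu \\
&= a \lim_{n\to\infty} \int_\Omega f_n \, d\mu + b \lim_{n\to\infty} \int_\Omega g_n \, d\mu = a \int_\Omega f \, d\mu + b \int_\Omega g \, d\mu.
\end{align*}
If such $f$ and $g$ satisfy $f \le g$ pointwise, then from $0 \le f_n \le \max\{f_n, g_n\} \uparrow g$ we see that
\[ \int_\Omega f \, d\mu = \lim_{n\to\infty} \int_\Omega f_n \, d\mu \le \lim_{n\to\infty} \int_\Omega \max\{f_n, g_n\} \, d\mu = \int_\Omega g \, d\mu. \]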


Let us now take a closer look at the role of null sets. We begin with a simple obser- 
vation. 


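Proposition F.3. If $f$ is a nonnegative measurable function, then:

(1) if $\int_\Omega f \, d\mu < \infty$, then $\mu\{f = \infty\} = 0$;
(2) if $\int_\Omega f \, d\mu = 0$, then $\mu\{f \ne 0\} = 0$.

Proof For all $c > 0$ we have $0 \le c 1_{\{f = \infty\}} \le f$ and therefore
\[ 0 \le c \mu\{f = \infty\} \le \int_\Omega f \, d\mu. \]
The first result follows from this by letting $c \to \infty$. For the second, note that for all $n \ge 1$
we have $\frac{1}{n} 1_{\{f \ge 1/n\}} \le f$ and therefore
\[ \frac{1}{n} \mu\{f \ge \tfrac{1}{n}\} \le \int_\Omega f \, d\mu = 0. \]
It follows that $\mu\{f \ge \frac{1}{n}\} = 0$. Now note that $\{f > 0\} = \bigcup_{n\ge 1} \{f \ge \frac{1}{n}\}$.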


The Monotone Convergence Theorem 
The next theorem is the cornerstone of Integration Theory. 


Theorem F.4 (Monotone Convergence Theorem). Let $0 \le f_1 \le f_2 \le \dots$ be a sequence
of nonnegative measurable functions converging pointwise to a function $f$. Then,
\[ \lim_{n\to\infty} \int_\Omega f_n \, d\mu = \int_\Omega f \, d\mu. \]
Proof First note that $f$ is nonnegative and measurable. For each $n \ge 1$ choose a sequence
of simple functions $0 \le f_{nk} \uparrow_k f_n$. Set
\[ g_{nk} := \max\{f_{1k}, \dots, f_{nk}\}. \]
For $m \le n$ we have $g_{mk} \le g_{nk}$. Also, for $k \le l$ we have $f_{mk} \le f_{ml}$, $m = 1, \dots, n$, and
therefore $g_{nk} \le g_{nl}$. We conclude that the functions $g_{nk}$ are monotone in both indices.

From $f_{mk} \le f_m \le f_n$, $1 \le m \le n$, we see that $f_{nk} \le g_{nk} \le f_n$, and we conclude that
$0 \le g_{nk} \uparrow_k f_n$. From
\[ f_n = \lim_{k\to\infty} g_{nk} \le \lim_{k\to\infty} g_{kk} \le f \]
we deduce that $0 \le g_{kk} \uparrow f$. Recalling that $g_{kk} \le f_k$, it follows that
\[ \int_\Omega f \, d\mu = \lim_{k\to\infty} \int_\Omega g_{kk} \, d\mu \le \lim_{k\to\infty} \int_\Omega f_k \, d\mu \le \int_\Omega f \, d\mu. \]


Example F.5. We have the following substitution formula. For any measurable $f :
\Omega_1 \to \Omega_2$ and nonnegative measurable $\varphi : \Omega_2 \to \mathbb{R}$,
\[ \int_{\Omega_1} \varphi \circ f \, d\mu = \int_{\Omega_2} \varphi \, df(\mu). \]
To prove this, note that this is trivial for simple functions $\varphi = 1_F$ with $F \in \mathscr{F}_2$. By
linearity, the identity extends to nonnegative simple functions $\varphi$, and by monotone convergence
(using Proposition F.1) to nonnegative measurable functions $\varphi$.


From the monotone convergence theorem we deduce the following useful corollary. 


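Theorem F.6 (Fatou's Lemma). Let $(f_n)_{n\ge 1}$ be a sequence of nonnegative measurable
functions on $(\Omega, \mathscr{F}, \mu)$. Then
\[ \int_\Omega \liminf_{n\to\infty} f_n \, d\mu \le \liminf_{n\to\infty} \int_\Omega f_n \, d\mu. \]
Proof From $\inf_{k\ge n} f_k \le f_m$, $m \ge n$, we infer
\[ \int_\Omega \inf_{k\ge n} f_k \, d\mu \le \inf_{m\ge n} \int_\Omega f_m \, d\mu. \]
Hence, by the monotone convergence theorem,
\begin{align*}
\int_\Omega \liminf_{n\to\infty} f_n \, d\mu &= \int_\Omega \lim_{n\to\infty} \inf_{k\ge n} f_k \, d\mu = \lim_{n\to\infty} \int_\Omega \inf_{k\ge n} f_k \, d\mu \\
&\le \lim_{n\to\infty} \inf_{m\ge n} \int_\Omega f_m \, d\mu = \liminf_{n\to\infty} \int_\Omega f_n \, d\mu.
\end{align*}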
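The inequality in Fatou's lemma can be strict, and a two-point probability space is enough to see this. In the sketch below everything is a hypothetical illustration: the space `{"a", "b"}` with equal weights, and the sequence $f_n$ that alternates between the indicator of `a` (even $n$) and the indicator of `b` (odd $n$). Because the sequence is 2-periodic, limits inferior reduce to minima over one period.

```python
from fractions import Fraction

# Hypothetical two-point probability space with equal weights.
MU = {"a": Fraction(1, 2), "b": Fraction(1, 2)}


def f(n):
    """f_n: indicator of {a} for even n, indicator of {b} for odd n."""
    hot = "a" if n % 2 == 0 else "b"
    return {w: Fraction(1 if w == hot else 0) for w in MU}


def integral(g):
    """Integral of a function on the finite space: a weighted sum."""
    return sum(g[w] * MU[w] for w in MU)


def fatou_sides():
    """Return (integral of liminf f_n, liminf of the integrals of f_n)."""
    period = [f(0), f(1)]
    liminf_f = {w: min(g[w] for g in period) for w in MU}
    return integral(liminf_f), min(integral(g) for g in period)
```

The pointwise limit inferior is identically zero, while every $f_n$ has integral $\frac{1}{2}$, so the left-hand side is $0$ and the right-hand side is $\frac{1}{2}$.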


The Dominated Convergence Theorem 


A measurable function $f : \Omega \to \mathbb{K}$ is called integrable if
\[ \int_\Omega |f| \, d\mu < \infty. \]
Clearly, if $f$ and $g$ are measurable and $|g| \le |f|$ pointwise, then $g$ is integrable if $f$ is
integrable. In particular, if $f$ is integrable, then the nonnegative functions $f^+$ and $f^-$
are integrable, and we define
\[ \int_\Omega f \, d\mu := \int_\Omega f^+ \, d\mu - \int_\Omega f^- \, d\mu. \]
For a set $F \in \mathscr{F}$ we write
\[ \int_F f \, d\mu := \int_\Omega 1_F f \, d\mu, \]
noting that $1_F f$ is integrable. The monotonicity and additivity properties of this integral
carry over to this more general situation, provided we assume that the functions we
integrate are integrable.

The next result is among the most useful in all of Analysis. 


Theorem F.7 (Dominated Convergence Theorem). Let $(f_n)_{n\ge 1}$ be a sequence of integrable
functions such that $\lim_{n\to\infty} f_n = f$ pointwise. If there exists an integrable function
$g$ such that $|f_n| \le |g|$ for all $n \ge 1$, then
\[ \lim_{n\to\infty} \int_\Omega |f_n - f| \, d\mu = 0. \]
In particular,
\[ \lim_{n\to\infty} \int_\Omega f_n \, d\mu = \int_\Omega f \, d\mu. \]

Proof We make the preliminary observation that if $(h_n)_{n\ge 1}$ is a sequence of nonnegative
measurable functions such that $\lim_{n\to\infty} h_n = 0$ pointwise and $h$ is a nonnegative
integrable function such that $h_n \le h$ for all $n \ge 1$, then by the Fatou lemma
\[ \int_\Omega h \, d\mu = \int_\Omega \liminf_{n\to\infty} (h - h_n) \, d\mu \le \liminf_{n\to\infty} \int_\Omega h - h_n \, d\mu = \int_\Omega h \, d\mu - \limsup_{n\to\infty} \int_\Omega h_n \, d\mu. \]
Since $\int_\Omega h \, d\mu$ is finite, it follows that $0 \le \limsup_{n\to\infty} \int_\Omega h_n \, d\mu \le 0$ and therefore
\[ \lim_{n\to\infty} \int_\Omega h_n \, d\mu = 0. \]
The theorem follows by applying this to $h_n = |f_n - f|$ and $h = 2|g|$.


If $f$ is integrable and $\mu\{f \ne 0\} = 0$, then
\[ \int_\Omega f \, d\mu = 0. \]
Indeed, by considering $f^+$ and $f^-$ separately we may assume $f$ is nonnegative. Choose
simple functions $0 \le f_n \uparrow f$. Then $\mu\{f_n > 0\} \le \mu\{f > 0\} = 0$ and therefore $\int_\Omega f_n \, d\mu = 0$
for all $n \ge 1$. The claim follows from this. Consequently, in the main results of the previous
section, in particular in the monotone convergence theorem (Theorem F.4) and the
dominated convergence theorem (Theorem F.7), we may replace pointwise convergence
by pointwise convergence $\mu$-almost everywhere, where the latter means that we allow
an exceptional set of $\mu$-measure zero in the assumptions. For instance, in the monotone
convergence theorem it suffices to assume that $0 \le f_n \uparrow f$ pointwise $\mu$-almost everywhere,
and similarly in the dominated convergence theorem it suffices to assume that
$\lim_{n\to\infty} f_n = f$ pointwise $\mu$-almost everywhere and $|f_n| \le |g|$ pointwise $\mu$-almost
everywhere for all $n$.


Fubini’s Theorem 


Proposition F.8. Let $(\Omega_1, \mathscr{F}_1)$ and $(\Omega_2, \mathscr{F}_2)$ be measurable spaces and let $f : \Omega_1 \times
\Omega_2 \to \mathbb{K}$ be measurable with respect to the product $\sigma$-algebra $\mathscr{F}_1 \times \mathscr{F}_2$. Then:

(1) the function $\omega_2 \mapsto f(\omega_1, \omega_2)$ is measurable;
(2) the function $\omega_1 \mapsto f(\omega_1, \omega_2)$ is measurable.

Proof The collection $\mathscr{F}$ of all sets $F \in \mathscr{F}_1 \times \mathscr{F}_2$ such that (1) and (2) hold for $f = 1_F$
is a $\sigma$-algebra containing every set of the form $F_1 \times F_2$ with $F_1 \in \mathscr{F}_1$ and $F_2 \in \mathscr{F}_2$. Since
$\mathscr{F}_1 \times \mathscr{F}_2$ is the smallest $\sigma$-algebra containing these sets, it follows that $\mathscr{F} = \mathscr{F}_1 \times \mathscr{F}_2$.

This proves that the proposition holds for all indicator functions $1_F$ with $F \in \mathscr{F}_1 \times
\mathscr{F}_2$. By taking linear combinations, the result extends to simple functions. The result for
arbitrary measurable functions then follows by pointwise approximation with simple
functions.


Theorem F.9 (Fubini, first version). Let (Qy,-F1, U1) and (Qo, F2, Un) be o-finite mea- 
sure spaces. If f : Q Xx Q2 — K is nonnegative and measurable with respect to the 
product o-algebra F\ X F2, then: 


(1) the [0,0°]-valued function @ + Jo, f(@1, @2) dui (@1) is measurable; 
(2) the [0,°°]-valued function @ + Jo, f(@1,@2) di2(@2) is measurable; 
(3) we have 


J. fal xue)= |) f faiaue= ff fauraun. 
Q) x Qo Qy JQ, Qy JQ 


Proof First suppose that 1j(Q1) = f2(Q2) = 1 and let ¥ be the collection of all 
sets F € F, x F2 such that (1)-(3) hold for f = 1-. We claim that ¥ is a o-algebra. 
Indeed, (1)—(3) are trivial for f = 1g = 0. If (1)-(3) hold for a set F € YF; x Fo, then 
1p (@1,@2) = 1—1,(@), @) implies that (1) and (2) hold for CF, and furthermore 


| Upp d(1 X ba) =| 1— 1p d(x ba) 
Q1xQ) Q) xQy 


=1-f Ap d(i x Ho) =1- f | 1p duty dun 
Q) x Q2 Qs Q) 
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=a) | 11d dy = | i Ipp du duz 
Q) JQ) Q) JQ) 


and similarly for the other repeated integral, so (3) holds for F. If (1)-(3) hold for 
disjoint sets 1y,,1p,,--- € FA x F2, the monotone convergence theorem implies that 
(1)-(3) hold for 1p with F = U,,51 Fn. This proves the claim. It is clear that (1)-(3) hold 
for all rectangles F, x Fy with Fy € YF; and Fy € Fy. Since F| x F» is the smallest 
o-algebra containing these rectangles, it follows that_ F = F, x F2. 

If 4; and Up are finite, we apply the preceding step to the normalised measures 
Bi /bi(Q1) and My /M2(Qz2) and again find that (1)-(3) hold for all F € ¥,; x Fo. This 
extends to the o-finite case by approximation and monotone convergence. 

By taking linear combinations, the result extends to nonnegative simple functions. 
The result for arbitrary nonnegative measurable functions then follows by another ap- 
plication of monotone convergence. 
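
As a small sanity check of assertion (3), the following sketch evaluates the three integrals for a
nonnegative function on a product of two finite measure spaces. The point masses `mu1`, `mu2` and
the function `f` are hypothetical choices made only for illustration; on finite sets the integrals
reduce to weighted sums, so this illustrates the statement rather than the proof.

```python
from fractions import Fraction

# Hypothetical finite measure spaces: point masses on {0, 1, 2} and on {0, 1}.
mu1 = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
mu2 = [Fraction(3, 4), Fraction(1, 4)]


def f(i: int, j: int) -> Fraction:
    """A nonnegative function on the product space (illustrative choice)."""
    return Fraction((i + 1) ** 2 + j)


# Integral with respect to the product measure mu1 x mu2.
product_integral = sum(
    f(i, j) * mu1[i] * mu2[j] for i in range(len(mu1)) for j in range(len(mu2))
)

# Repeated integral: integrate over the first variable, then over the second.
repeated_12 = sum(
    mu2[j] * sum(f(i, j) * mu1[i] for i in range(len(mu1))) for j in range(len(mu2))
)

# Repeated integral: integrate over the second variable, then over the first.
repeated_21 = sum(
    mu1[i] * sum(f(i, j) * mu2[j] for j in range(len(mu2))) for i in range(len(mu1))
)

print(product_integral, repeated_12, repeated_21)
```

Exact rational arithmetic is used so that the three values can be compared without rounding.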


A variation of the Fubini theorem holds if we replace ‘nonnegative measurable’ by 
‘integrable’: 


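Theorem F.10 (Fubini, second version). Let $(\Omega_1,\mathscr{F}_1,\mu_1)$ and
$(\Omega_2,\mathscr{F}_2,\mu_2)$ be $\sigma$-finite measure spaces. If
$f : \Omega_1 \times \Omega_2 \to \mathbb{K}$ is integrable with respect to the product measure
$\mu_1 \times \mu_2$, then:


(1) the function $\omega_2 \mapsto \int_{\Omega_1} f(\omega_1,\omega_2)\,d\mu_1(\omega_1)$ is integrable with respect to $\mu_2$;
(2) the function $\omega_1 \mapsto \int_{\Omega_2} f(\omega_1,\omega_2)\,d\mu_2(\omega_2)$ is integrable with respect to $\mu_1$;
(3) we have

\[
\int_{\Omega_1\times\Omega_2} f\,d(\mu_1\times\mu_2)
 = \int_{\Omega_2}\int_{\Omega_1} f\,d\mu_1\,d\mu_2
 = \int_{\Omega_1}\int_{\Omega_2} f\,d\mu_2\,d\mu_1.
\]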


Proof By splitting into real and imaginary parts and then into positive and negative 
parts, we may assume that f is nonnegative. Hence (3) holds by Theorem F.9, with a 
finite left-hand side. It follows that the two repeated integrals are finite. Since an integral 
with respect to a measure $\mu$ of a nonnegative function is finite only if the integrand is
finite $\mu$-almost everywhere, assertions (1) and (2) follow as well.


Appendix G 
Notes 


Historical perspectives on Functional Analysis are presented in Dieudonné (1981); 
Monna (1973); Pietsch (2007). Among the many excellent textbooks on Functional 
Analysis, our favourites include Bressan (2013); Brezis (2011); Conway (1990); Dun- 
ford and Schwartz (1988a); Einsiedler and Ward (2017); Lax (2002); Rudin (1991); 
Schechter (2002); Werner (2000); Yosida (1980). 


Chapter 1 


Exhaustive treatments of the theory of Banach spaces and the Bochner integral are given 
in Albiac and Kalton (2006); Diestel (1984); Dunford and Schwartz (1988a); Li and 
Queffélec (2004) and Diestel and Uhl (1977); Dunford and Schwartz (1988a); Hytönen
et al. (2016), respectively. 


Chapter 2 


The proofs of Propositions 2.19 and 2.23 are taken from Haase (2007). Our presentation
of Proposition 2.34 and Section 2.3.d follows Hytönen et al. (2016), where more
detailed information on this topic can be found. The classical reference is Stein (1970).
The Fréchet–Kolmogorov compactness theorem is usually stated for bounded subsets
of $L^p(\mathbb{R}^d)$. That boundedness follows from the assumptions (i) and (ii) was observed
later by Sudakov; the simple proof presented here is from Hanche-Olsen et al. (2019).

Our treatment of Theorems 2.45 and 2.46 follows Bogachev (2007b). 

Comprehensive treatments of vector lattices, Banach lattices, and positive operators 
are given in Aliprantis and Burkinshaw (1985); Luxemburg and Zaanen (1971); Meyer- 
Nieberg (1991); Schaefer (1974); Zaanen (1997). 

The problem of proving the boundedness of $T \otimes I_X$ on $L^p(\Omega;X)$ for bounded
operators $T$ on $L^p(\Omega)$ discussed in Problem 2.28 is highly nontrivial. An interesting
complement to the results mentioned in the problem is the following result of Paley and
Marcinkiewicz–Zygmund: if $T$ is bounded on $L^p(\Omega)$ with $1 \le p < \infty$ and $X$ is a
Hilbert space, then $T \otimes I_X$ admits a unique extension to a bounded operator on
$L^p(\Omega;X)$ and its norm equals $\|T\|$. The proof uses properties of Gaussian random
variables. It was shown by Kwapień that the Fourier–Plancherel transform (see Section 5.5)
extends to a bounded operator on $L^2(\mathbb{R}^d;X)$ if and only if $X$ is isomorphic to a
Hilbert space (and as such is unitary if $X$ is a Hilbert space); by results of Bourgain and
Burkholder, the Hilbert transform (see Section 5.6) extends to a bounded operator on
$L^p(\mathbb{R};X)$ for some $1 < p < \infty$ if and only if it extends to a bounded operator on
$L^p(\mathbb{R};X)$ for all $1 < p < \infty$ if and only if $X$ has the so-called UMD property;
this abbreviation stands for “unconditionality of martingale differences”. Proofs of these
results and their ramifications can be found in Hytönen et al. (2016); Pisier (2016). The UMD
property also characterises the boundedness of the vector-valued extension of the Itô stochastic
integral of Problem 3.25; see van Neerven et al. (2015) and the references given therein.


Chapter 3 


Theorem 3.13 characterises Hilbert spaces up to isomorphism. More precisely, the fol- 
lowing deep theorem has been proved in Lindenstrauss and Tzafriri (1971): A Banach 
space X is isomorphic to a Hilbert space if and only if every closed subspace of X is the 
range of a bounded projection in X. 

The proof of the Radon—Nikodym theorem outlined in Problem 3.22 is due to von 
Neumann and follows Rudin (1987). The construction, in Problem 3.24, of a linear
operator on $\ell^2$ which fails to be bounded depends on the existence of an algebraic basis
in $\ell^2$ (see Problem 3.23) which in turn is deduced with the help of Zorn’s lemma. The
latter being equivalent to the Axiom of Choice, this raises the question whether a con- 
structive example of an unbounded operator can be given. Within Zermelo—Fraenkel Set 
Theory (ZF) the answer is negative: it is consistent with ZF that every linear operator 
on a Banach space is bounded. In fact, it is a theorem in ZF extended with the so-called 
Axiom of Determinacy and the Countable Axiom of Choice that every linear operator 
on a Banach space is bounded (Fremlin, 2015, Theorem 567H (c)). 

It can be quite hard to decide whether a given subspace is dense in a given Hilbert
space. The following example may illustrate this. Let $H$ denote the Hilbert space of all
scalar sequences $c = (c_n)_{n\ge 1}$ for which the norm

\[
\|c\|^2 = \sum_{n\ge 1} \frac{1}{n(n+1)}\,|c_n|^2
\]

is finite. For $x \in \mathbb{R}$ let $\{x\}$ denote its fractional part, that is, the unique
real number in $[0,1)$ such that $x = k + \{x\}$ for some integer $k \in \mathbb{Z}$. For
$m = 1,2,3,\dots$ let $c^{(m)} := (\{n/m\})_{n\ge 1}$ and note that these sequences belong to $H$.
It is a theorem of Nyman and Báez-Duarte that the linear span of the sequence
$(c^{(m)})_{m\ge 1}$ is dense in $H$ if and only if the Riemann hypothesis holds. This result, as
well as several related ones, is surveyed in Bagchi (2006). The Riemann hypothesis is considered
by many mathematicians as one of the most important open problems in all of Mathematics.
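
The claim that the fractional-part sequences belong to $H$ can be made concrete. The sketch below
assumes that the norm is the weighted $\ell^2$-norm with weights $1/(n(n+1))$ (the form used in the
literature on this criterion) and computes exact partial sums of the squared norm; the helper name
`partial_norm_sq` is hypothetical. Since $0 \le \{n/m\} < 1$ and
$\sum_{n\le N} 1/(n(n+1)) = 1 - 1/(N+1)$, every partial sum stays below 1.

```python
from fractions import Fraction


def partial_norm_sq(m: int, terms: int) -> Fraction:
    """Partial sum of sum_n {n/m}^2 / (n (n + 1)) over n = 1, ..., terms."""
    total = Fraction(0)
    for n in range(1, terms + 1):
        frac_part = Fraction(n % m, m)  # {n/m}, the fractional part of n/m
        total += frac_part ** 2 / (n * (n + 1))
    return total


for m in range(1, 6):
    print(m, float(partial_norm_sq(m, 200)))
```

Note that $m = 1$ gives the zero sequence, because $\{n/1\} = 0$ for every integer $n$.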


Chapter 4 


Our proof of Theorem 4.2 combines ideas of Folland (1999) and Ruzhansky and Tu- 
runen (2010). One should be aware that different authors use slightly different defini- 
tions of Radon measures. 

The real version of the Hahn–Banach theorem was obtained independently by Hahn and
Banach; the extension to complex scalars was added about a decade later by Bohnenblust and
Sobczyk and, independently, by Soukhomlinov. Banach also proved the sequential version of the
Banach–Alaoglu theorem; the general version is due to Alaoglu. A
detailed survey of the Hahn–Banach theorem is given in Buskes (1993).

The weak and weak* topologies are special cases of so-called locally convex topolo- 
gies. For systematic introductions to this subject we recommend Aliprantis and Burkin- 
shaw (1985); Conway (1990); Rudin (1991). Theorem 4.47 is a special case of the 
so-called principle of local reflexivity. Its full formulation can be found, for example, in 
Albiac and Kalton (2006). 

Our proof of Theorem 4.63 is closely related to that presented in Bogachev (2007a), 
where more refined versions of the theorem can be found. 

The result of Problem 4.10 is discussed in Phelps (1960). The converse also holds: If
every functional on a closed subspace of $X$ has a unique Hahn–Banach extension of the
same norm, then $X^*$ is strictly convex; see Foguel (1958); Taylor (1939).

The result of Problem 4.32 is due to Sobczyk (1941).


Chapter 5 


General references on the theory of bounded operators include Beauzamy (1988); Go- 
hberg et al. (2003, 2013); Nikolski (2002). 

The proof of the uniform boundedness theorem sketched in Problem 5.3 is taken from 
Pietsch (2007), where it is credited to Lebesgue. 

Most treatments of the Fourier–Plancherel transform use the Schwartz space
$\mathscr{S}(\mathbb{R}^d)$ of rapidly decreasing smooth functions instead of our
$\mathscr{F}(\mathbb{R}^d)$. Our treatment of the Fourier transform and the Hilbert transform
follows that of Hytönen et al. (2016). The $L^p$-boundedness of the Hilbert transform is
classical; we follow Grafakos (2008).

The theory of Fourier multiplier operators can be meaningfully extended to the $L^p$-setting,
where it becomes a powerful tool in the Calderón–Zygmund theory of singular
integral operators. The prime example of such an operator is the Hilbert transform.
Detailed treatments of the Hilbert transform and singular integral operators in the
$L^p$-setting are given in Grafakos (2008); Stein (1970, 1993). The theory of the Hilbert
transform extends to higher dimensions where analogous statements hold for the Riesz
transforms, defined as the Fourier multiplier operators associated with the functions
$m_j \in L^\infty(\mathbb{R}^d)$ defined by

\[
m_j(\xi) = -i\,\frac{\xi_j}{|\xi|}, \qquad \xi \in \mathbb{R}^d, \quad j = 1,\dots,d.
\]


An exhaustive treatment of these matters belongs to the realm of Harmonic Analysis; 
see Grafakos (2008); Stein (1970) and Chapter 5 of Hytönen et al. (2016).

The proof of the Riesz–Thorin theorem 5.39 presented here is taken from Hytönen
et al. (2016), where also the argument proving $\|T_{\mathbb{C}}\| = \|T\|$ can be found. In this
reference, the proof of the Clarkson inequalities sketched in Problem 5.26 is attributed
to Jürgen Voigt. It is a famous result of Beckner (1975) that the constant 1 in the
Hausdorff–Young inequality

\[
\|\mathscr{F} f\|_{L^q(\mathbb{R}^d,m)} \le \|f\|_{L^p(\mathbb{R}^d,m)}
\]

for the Fourier transform with respect to the normalised Lebesgue measure $m$, where
$1 \le p \le 2$ and $\frac1p + \frac1q = 1$, can be improved to

\[
\|\mathscr{F} f\|_{L^q(\mathbb{R}^d,m)} \le C_p^d\,\|f\|_{L^p(\mathbb{R}^d,m)}
\]

with $C_p = (p^{1/p}/q^{1/q})^{1/2}$. In the same paper, Beckner proved the improvement to the
Young inequality mentioned in the main text and showed that both results are sharp.
The proofs rely on (but go beyond) the techniques developed in Section 15.6.
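
To get a feeling for the size of the improvement, the following sketch evaluates the constant
$C_p = (p^{1/p}/q^{1/q})^{1/2}$ with $\frac1p + \frac1q = 1$ for a few exponents. The function name
`beckner_constant` and the optional dimension parameter `d` (returning $C_p^d$, which is an
assumption about how the constant enters in dimension $d$) are chosen for illustration only.

```python
def beckner_constant(p: float, d: int = 1) -> float:
    """Return (p^(1/p) / q^(1/q))^(d/2) for 1 < p <= 2, where 1/p + 1/q = 1."""
    if not 1.0 < p <= 2.0:
        raise ValueError("p must satisfy 1 < p <= 2")
    q = p / (p - 1.0)  # conjugate exponent
    one_dimensional = (p ** (1.0 / p) / q ** (1.0 / q)) ** 0.5
    return one_dimensional ** d


for p in (1.1, 1.5, 1.9, 2.0):
    print(p, beckner_constant(p))
```

For $p = 2$ the constant equals 1, in agreement with the Plancherel theorem; for the other sampled
exponents it is strictly smaller than 1.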

The Marcinkiewicz interpolation theorem is of fundamental importance in the theory
of singular integrals; see Grafakos (2008); Stein (1970) and the forthcoming work
Hytönen et al. (2022+). Our treatment follows Hytönen et al. (2016). The $L^p$-boundedness
of the Hilbert transform, here derived as a consequence of the Riesz–Thorin theorem,
can also be derived from the Marcinkiewicz interpolation theorem; the required
weak $L^1$-bound is due to Kolmogorov; see Duoandikoetxea (2001).

The result of Problem 5.16 is due to Pettis (1938). It is no coincidence that the
counterexample for $p = 1$ in part (b) lives in the space $c_0$: part (a) extends to $p = 1$
for all Banach spaces not containing a closed subspace isomorphic to $c_0$. A proof of this
fact is given in Diestel and Uhl (1977); further results on the Pettis integral can be found
in van Dulst (1989); Musiał (2002); Talagrand (1984).


Chapter 6


There are many excellent treatments of spectral theory, such as the monumental classic 
Dunford and Schwartz (1988b) and the monographs Arveson (2002); Aupetit (1991); 
Müller (2007). A discussion of the conditions (i) and (ii) for contours in Cauchy’s theorem
can be found in Rudin (1987). The result of Problem 6.15 is due to Gelfand (1941);
the proof outlined here is due to Allan and Ransford (1989). 


Chapter 7 


Our proofs of Proposition 7.27 and Theorems 7.29 and 7.31 follow Schechter (2002),
Bleecker and Booß-Bavnbek (2013), and Böttcher and Silbermann (2006), respectively.
The proof of Theorem 7.33 follows Coburn’s original papers Coburn (1966, 1967); see
also Arveson (2002). Another proof is given in Douglas (1998), where also the results
from the theory of Hardy spaces alluded to in the proof can be found. Toeplitz operators
are covered in more depth in Arveson (2002); Böttcher and Silbermann (2006); Douglas
(1998); Nikolski (2002).

The winding number of a continuous closed curve in $\mathbb{C} \setminus \{0\}$ parametrised by
the function $[0,1] \to \mathbb{C} \setminus \{0\}$, $t \mapsto \phi(e^{2\pi i t})$, can be
computed as follows. One shows that there exists a continuous function
$g : [0,1] \to \mathbb{C}$ such that

\[
\phi(e^{2\pi i t}) = e^{2\pi i g(t)}, \qquad t \in [0,1].
\]

The identity $e^{2\pi i g(0)} = \phi(1) = e^{2\pi i g(1)}$ implies that $g(1) - g(0) \in \mathbb{Z}$.
This integer equals the winding number of $\phi$. Proofs and some easy consequences can be found
in Arveson (2002).
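
The recipe above can be imitated numerically. The sketch below approximates the increment
$g(1) - g(0)$ by sampling a zero-free function $\phi$ on the unit circle and summing the small
changes of the argument between consecutive sample points; this discretisation is an illustration
and not the argument used in the reference. The function name `winding_number` and the sample
count are hypothetical choices, and the result is reliable only if the samples are fine enough
that consecutive values of $\phi$ differ in argument by less than $\pi$.

```python
import cmath
import math


def winding_number(phi, samples: int = 4096) -> int:
    """Approximate the winding number of t -> phi(exp(2 pi i t)) around 0."""
    total_angle = 0.0
    previous = phi(cmath.exp(0j))  # value at t = 0, i.e. phi(1)
    for k in range(1, samples + 1):
        current = phi(cmath.exp(2j * math.pi * k / samples))
        # Argument increment in (-pi, pi]; summing these tracks 2 pi (g(t) - g(0)).
        total_angle += cmath.phase(current / previous)
        previous = current
    return round(total_angle / (2 * math.pi))


print(winding_number(lambda z: z ** 3))      # curve winds three times around 0
print(winding_number(lambda z: 1 / z ** 2))  # two turns in the opposite direction
print(winding_number(lambda z: z + 2))       # curve does not encircle 0
```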

The result of Problem 7.4 has the following interesting complement, due to Pitt: For
all $1 \le p < q < \infty$, every bounded operator from $\ell^q$ to $\ell^p$ is compact. An
immediate consequence is that the spaces $\ell^p$ and $\ell^q$ are not isomorphic. The proof of
Pitt’s theorem requires some effort; see for instance Albiac and Kalton (2006); Ryan (2002). The
result of Problem 7.8 is due to Terzioğlu (1971).

By using some elementary C*-algebra techniques it is possible to derive Theorem
7.31 as a simple corollary to Theorem 7.33. We begin by introducing some terminology.
A Banach algebra is a Banach space $\mathscr{A}$ endowed with a composition mapping
$(x,y) \mapsto xy$ such that

\[
\|xy\| \le \|x\|\,\|y\|
\]

holds for all $x,y \in \mathscr{A}$. A unital Banach algebra is a Banach algebra $\mathscr{A}$
with a unit element, that is, an element $e \in \mathscr{A}$ such that

\[
ex = xe = x, \qquad x \in \mathscr{A}.
\]
The spectrum $\sigma_{\mathscr{A}}(x)$ of an element $x$ of a unital Banach algebra $\mathscr{A}$
is the set of all $\lambda \in \mathbb{C}$ for which no $y \in \mathscr{A}$ can be found such that
$(\lambda - x)y = y(\lambda - x) = e$, and most of the spectral theory contained in Chapter 6 can
be routinely extended to this situation.

A C*-algebra is a Banach algebra with an involution, that is, a mapping $x \mapsto x^*$ on
$\mathscr{A}$ satisfying

\[
(x+y)^* = x^* + y^*, \qquad (cx)^* = \overline{c}\,x^*, \qquad (xy)^* = y^* x^*,
\]

as well as

\[
\|x^*\| = \|x\|, \qquad \|x^* x\| = \|x\|^2
\]

for all $x,y \in \mathscr{A}$ and $c \in \mathbb{C}$. According to the Gelfand–Naimark theorem,
every C*-algebra $\mathscr{A}$ is $*$-isometric to a closed $*$-subalgebra of $\mathscr{L}(H)$ for
a suitably chosen Hilbert space $H$ (a $*$-subalgebra being a subalgebra closed under taking
involutions and a $*$-isometric isomorphism being an isometric isomorphism $U$ with the additional
properties that $U(xy) = UxUy$ and $U(x^*) = (Ux)^*$ for all $x,y \in \mathscr{A}$). This theorem
connects the abstract definition of C*-algebras given here with the concrete approach taken in
Section 9.5.

The following generalisation of Proposition 8.19 holds (see (Folland, 2016, Proposition 1.23)
or (Rudin, 1991, Theorem 11.29)): If $\mathscr{B}$ is a closed unital $*$-subalgebra of a
unital C*-algebra $\mathscr{A}$, then

\[
\sigma_{\mathscr{B}}(x) = \sigma_{\mathscr{A}}(x), \qquad x \in \mathscr{B}. \tag{G.1}
\]

The proof follows Proposition 8.19, except for the fact that $x = x^*$ implies
$\sigma_{\mathscr{B}}(x^* x) \subseteq \mathbb{R}$; this fact can be proved by combining the
Gelfand–Naimark theorem and Proposition 8.19. An elementary proof can be given as follows. If
$u \in \mathscr{B}$ is unitary, that is, $uu^* = u^*u = e$, the argument suggested in Problem 8.2
proves that $\sigma_{\mathscr{B}}(u) \subseteq \mathbb{T}$. Then, the argument suggested in
Problem 8.3 proves that if $x = x^*$, then $\sigma_{\mathscr{B}}(x) \subseteq \mathbb{R}$.

Using (G.1), let us now give a simple alternative proof of Theorem 7.31 based on
Theorem 7.33 (cf. Corollary 1 in (Arveson, 2002, Section 4.3)). The results of Problem
7.12 prove that for any Banach space $X$ the Calkin algebra $\mathscr{L}(X)/\mathscr{K}(X)$ is a
unital Banach algebra. If $H$ is a Hilbert space, then $\mathscr{L}(H)/\mathscr{K}(H)$ is a unital
C*-algebra.

In what follows we take $H := H^2(\mathbb{D})$. Since $\mathscr{K}(H)$ is contained in the
Toeplitz algebra $\mathscr{T}$ it is meaningful to consider the quotient space
$\mathscr{T}/\mathscr{K}(H)$. This space is a unital $*$-subalgebra of
$\mathscr{L}(H)/\mathscr{K}(H)$. By Coburn’s theorem the mapping $T_\phi + K \mapsto \phi$ sets up
a $*$-isometry from $\mathscr{T}/\mathscr{K}(H)$ onto $C(\mathbb{T})$ and we have the commuting
diagram

\[
0 \longrightarrow \mathscr{K}(H) \longrightarrow \mathscr{T}
  \overset{\pi}{\longrightarrow} C(\mathbb{T}) \longrightarrow 0
\]
where $\pi$ are the quotient mappings and $j$ is the composition of the $*$-isometry from
$C(\mathbb{T})$ onto $\mathscr{T}/\mathscr{K}(H)$ provided by Theorem 7.33 and the natural
inclusion mapping from $\mathscr{T}/\mathscr{K}(H)$ into $\mathscr{L}(H)/\mathscr{K}(H)$. As such,
$j$ is injective.

Now suppose that $\phi \in C(\mathbb{T})$ is such that $T_\phi \in \mathscr{T}$ is Fredholm. By
Atkinson’s theorem there exists an $S \in \mathscr{L}(H^2(\mathbb{D}))$ such that $I - T_\phi S$
and $I - S T_\phi$ are compact. This means that $T_\phi$ defines an invertible element in
$\mathscr{L}(H)/\mathscr{K}(H)$. As an application of (G.1), $T_\phi$ defines an invertible
element in $\mathscr{T}/\mathscr{K}(H)$. A moment’s reflection reveals that this implies that
$S \in \mathscr{T}$, say $S = T_\psi + K$ with $\psi \in C(\mathbb{T})$ and $K \in \mathscr{K}(H)$.
It then follows that

\[
j(\phi\psi) = \pi(T_\psi T_\phi) = \pi(S T_\phi) = \pi(I) = j(1),
\]

and the injectivity of $j$ implies $\phi\psi = 1$. This is only possible if $\phi$ is zero-free.


Chapter 8 


Our proof of Theorem 8.18 is taken from Rudin (1991). A proof of Runge’s theorem, 
which was used in the proof of part (1) of Theorem 8.20, may be found in Rudin (1987). 
The clever proof of Proposition 8.21 is taken from Whitley (1968). The proof of Theo- 
rem 8.36 is taken from Davies (1980). 

The proof of the Toeplitz—Hausdorff theorem proposed in Problem 8.9 is due to Li 
(1994). More about numerical ranges can be found in Gustafson and Rao (1997). 


Chapter 9 


Most treatments of the spectral theorem for normal operators proceed via the theory of 
C*-algebras; see, for example, Arveson (2002); Rudin (1991). This permits a concise 
abstract proof, but has the drawback that this theory depends on the existence of max- 
imal ideals, a well-known consequence of Zorn’s lemma. Our approach avoids the use 
of Zorn’s lemma. The idea to use Proposition 9.12 to prove that the projection-valued 
measure is concentrated on the spectrum is from Haase (2018). Our treatment of the von 
Neumann bicommutant theorem and the result stated in Problem 15.11 are taken from 
Pedersen (2018). The presentation of Section 9.6 follows Koelink (1996). 

In Heuser (2006) a direct, albeit tricky, proof is given of the result of Problem 9.13 
which relies solely on the continuous functional calculus for selfadjoint operators. 


Chapter 10


References for this chapter include Akhiezer and Glazman (1981a); Birman and Solom- 
jak (1987); Dunford and Schwartz (1988b); Edmunds and Evans (2018b); Kato (1995); 
Reed and Simon (1975); Schmüdgen (2012). Our proof of the spectral theorem com-
bines elements of Rudin (1991) and Schmüdgen (2012), and is elementary in that it
avoids the use of C*-algebra techniques. For selfadjoint operators a more direct con- 
struction of the measurable calculus can be given; see, for example, Rudin (1991), where 
it is used to give a simpler proof of the existence and uniqueness of square roots for pos- 
itive selfadjoint operators. 

The proof of the inclusion ‘$\subseteq$’ of Theorem 9.28 presented in Section 10.3.b is taken
from Sz.-Nagy (1967). 


Chapter 11 


The connections between Functional Analysis and the theory of partial differential equa- 
tions are emphasised in Bressan (2013); Brezis (2011); Jost (2013). The results of this 
chapter barely scratch the surface of what can be said in this context. 

Sobolev spaces are treated in detail in Adams and Fournier (2003); Evans (2010). 
Some of our proofs are modelled after those presented in these references. Our
presentation of Propositions 11.5 and 11.16 follows Hytönen et al. (2016). The proof of
Theorem 11.12 follows an idea of Krylov (2008).

Extension operators are treated in Adams and Fournier (2003); Evans (2010). The 
proof of Step 1 of Theorem 11.28 is from Adams and Fournier (2003). Our proof of 
Theorem 11.27 is based on unpublished lecture notes by Mark Veraar. The theorem,
which asserts the density of $C^\infty(\overline{D})$ in $W^{k,p}(D)$ for bounded $C^k$-domains
$D$, actually gives the stronger result that for any $f \in W^{k,p}(D)$ there exists a sequence of
functions $f_n \in C_c^\infty(\mathbb{R}^d)$ whose restrictions to $D$ satisfy
$\lim_{n\to\infty} \|f_n - f\|_{W^{k,p}(D)} = 0$. In this connection it is worth mentioning that
if $D$ is a bounded $C^k$-domain, then every function $f \in C^k(\overline{D})$ is the restriction
of a function in $C^k(\mathbb{R}^d)$; the analogous result holds for functions in
$C^\infty(\overline{D})$ when $D$ is a bounded $C^\infty$-domain. In both cases, the extensions can
be realised through a linear mapping. This result is due to Seeley (1964).

The proof of Theorem 11.24 follows Arendt and Urban (2010) and Brezis (2011). The
$C^1$-conditions of the second part of the theorem can be relaxed; see Biegert and Warma
(2006). If $D$ is bounded and has $C^1$-boundary $\partial D$, then for $1 \le p < \infty$ the
mapping $f \mapsto f|_{\partial D}$ for $f \in C^\infty(\overline{D})$ admits a unique extension
to a bounded operator $T$, the trace operator, from $W^{1,p}(D)$ to $L^p(\partial D)$. Here, we
think of $\partial D$ as being equipped with its surface measure. Moreover, for a function
$f \in W^{1,p}(D)$ one has $f \in W_0^{1,p}(D)$ if and only if $Tf = 0$. The details can be found
in Adams and Fournier (2003); Brezis (2011); Evans (2010).

It is not true in general that the weak solution of the Poisson problem $-\Delta u = f$ with
$f \in L^2(D)$ subject to Dirichlet boundary conditions belongs to $H^2(D)$; a counterexample
can be found in Satz 6.89 of Arendt and Urban (2010). Proofs of the $H^2$-regularity result
mentioned in Remark 11.38 and its analogue for Neumann boundary conditions can be
found in Chapter 6 of Evans (2010).

Systematic treatments of the finite element method are presented in the monographs 
Atkinson and Han (2009); Brenner and Scott (2008). 

Problems 11.23 and 11.24 are taken from Krylov (2008) and reproduce Sobolev’s
original proof of the inequality named after him. The outline of the proof, in Problem
11.26, that for $f \in C_c(D)$ no solution in $C^2(D) \cap C(\overline{D})$ to the Poisson problem
may exist, is taken from Arendt and Urban (2010). Problems 11.28 and 11.29 are modelled after
the same reference.


Chapter 12 


Excellent references for the theory of forms are Kato (1995); Ouhabaz (2005). Some of 
our proofs follow the latter reference. For the spectral theory of differential operators 
the reader is referred to Edmunds and Evans (2018a,b) and the references given therein, 
and, for variational methods, Henrot (2006). More complete treatments of Dirichlet and 
Neumann Laplacians can be found in the lecture notes Arendt (2006) and the survey 
papers Arendt (2004); Grebenkov and Nguyen (2013), where further references to the 
literature are given. A standard reference for the theory of elliptic second-order differ- 
ential operators is Gilbarg and Trudinger (2001). 

Our proof of Theorem 12.12 follows Arendt (2006). Further results along the lines of 
this theorem and its corollary can be found there and in Ouhabaz (2005). Among other 
things, under the assumptions of the corollary, D(A) is dense in D(a). 

In some of the results about the Neumann Laplacian, the $C^1$ assumption on the boundary
can be relaxed. For example, the Neumann Laplacian has point spectrum if $D$ has
continuous boundary and $D$ lies ‘on one side’ of it. The steps are as follows: If $D$
has continuous boundary, then $W^{1,2}(D)$ is compactly embedded in $L^2(D)$ according
to Theorem V.4.17 of Edmunds and Evans (2018b) and the Neumann Laplacian has a
compact resolvent. The assumption on the boundary in Theorem 12.26(2) and Theorem
12.27 may be weakened accordingly.

Kac’s question “Can one hear the shape of a drum?” was asked in Kac (1966) and
answered in the negative in Gordon et al. (1992). Our presentation of Weyl’s theorem
follows Higson (2004). An example of a Jordan curve of positive area is given in Osgood
(1903).
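
The inequality $\mu_n \le \lambda_n$ of Corollary 12.28 comparing the Dirichlet eigenvalues
$\lambda_n$ and the Neumann eigenvalues $\mu_n$ admits a significant improvement, due to
Friedlander (1991) who proved that for all $n \ge 1$ we have

\[
\mu_{n+1} \le \lambda_n.
\]

After reducing to smooth domains, an important step in the proof is the spectral flow
inequality

\[
N_{\mathrm{Neum}}(\lambda) - N_{\mathrm{Dir}}(\lambda) = n(\lambda),
\]

for $\lambda > 0$ satisfying
$\lambda \notin \sigma(-\Delta_{\mathrm{Dir}}) \cup \sigma(-\Delta_{\mathrm{Neum}})$, where

\[
\begin{aligned}
N_{\mathrm{Dir}}(\lambda) &= \#\{\lambda_n \in \sigma(-\Delta_{\mathrm{Dir}}) : \lambda_n < \lambda\}, \\
N_{\mathrm{Neum}}(\lambda) &= \#\{\lambda_n \in \sigma(-\Delta_{\mathrm{Neum}}) : \lambda_n < \lambda\},
\end{aligned}
\]

and $n(\lambda)$ is the number of negative eigenvalues of the Dirichlet-to-Neumann operator
$R_\lambda$, counting multiplicities throughout. This is the operator on $L^2(\partial D)$ which
maps a function $f \in L^2(\partial D)$ to
$\frac{\partial u}{\partial \nu}\big|_{\partial D} \in L^2(\partial D)$, where $u \in H^1(D)$ is
the unique solution of the problem

\[
\begin{aligned}
-\Delta u &= \lambda u && \text{on } D, \\
u &= f && \text{on } \partial D.
\end{aligned}
\]

It is selfadjoint, bounded below, and has compact resolvent.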

A simpler proof of Friedlander’s theorem, based on a variant of the Courant–Fischer
theorem, was obtained by Filonov (2005). Levine and Weinberger (1986) obtained the
inequality

\[
\mu_{n+d} \le \lambda_n
\]

for bounded convex domains $D$ in $\mathbb{R}^d$, with strict inequality when $\partial D$ is
smooth.

Weyl’s theorem has been extended to other types of boundary conditions, including 
Neumann boundary conditions, and positive selfadjoint elliptic operators. For more de- 
tails the reader is referred to Safarov and Vassilev (1997). Such extensions are nontrivial 
even for the Laplace operator because the domain monotonicity for Dirichlet eigenval- 
ues of Lemma 12.30 generally fails for boundary conditions other than Dirichlet. This is 
demonstrated by the following example, taken from Funano (2022). We use the notation
$a \lesssim b$ to express that $a \le Cb$ for a universal constant $C$.

For $1 \le p \le 2$ let $B_{\ell_p^d}$ denote the open unit ball of $\ell_p^d$, the space
$\mathbb{K}^d$ endowed with the norm given by $\|x\|_p^p = \sum_{j=1}^d |x_j|^p$. If the positive
real number $r_{d,p}$ is defined by the condition $\mathrm{vol}(r_{d,p} B_{\ell_p^d}) = 1$, then
$r_{d,p} \sim d^{1/p}$. The smallest Neumann eigenvalue for the Laplace operator on
$D' := r_{d,p} B_{\ell_p^d}$ can be shown to satisfy

\[
\mu_{2,D'} \gtrsim 1
\]

(keep in mind our convention that $\mu_{1,D'} = 0$). Approximating the segment in $D'$
connecting the origin and the point $(r_{d,p},0,0,\dots,0)$ by a convex $C^1$-domain
$D \subseteq D'$, it can be shown that

\[
\mu_{2,D} \lesssim \frac{1}{r_{d,p}^2} \sim \frac{1}{d^{2/p}}.
\]

In the positive direction, in the same reference the following monotonicity result is
proved: If $D, D' \subseteq \mathbb{R}^d$ are bounded convex sets with $C^1$ boundaries and if
$D \subseteq D'$, then we have

\[
\mu_{n,D'} \lesssim d^2\,\mu_{n,D}, \qquad n \ge 1.
\]

The counterexample (upon letting $p \downarrow 1$) shows that the factor $d^2$ is essentially
optimal.
Problems 12.2—12.4 are taken from Arendt (2006). 
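
The scaling of the normalising radius in the example above (of order $d^{1/p}$) can be checked
numerically. The sketch below assumes real scalars and uses the classical volume formula
$\mathrm{vol}(B_{\ell_p^d}) = (2\Gamma(1+\tfrac1p))^d / \Gamma(1+\tfrac dp)$, which is not stated in
the text; the function names `radius` and `limit_ratio` are hypothetical. By Stirling’s formula the
quotient of the radius and $d^{1/p}$ should approach $(pe)^{-1/p} / (2\Gamma(1+\tfrac1p))$ as
$d \to \infty$.

```python
import math


def radius(d: int, p: float) -> float:
    """Radius r such that the ball r * B in real l_p^d has Lebesgue volume 1."""
    # log vol(B) = d * log(2 * Gamma(1 + 1/p)) - log Gamma(1 + d/p)
    log_volume = d * math.log(2.0 * math.gamma(1.0 + 1.0 / p)) - math.lgamma(1.0 + d / p)
    # vol(r * B) = r^d * vol(B) = 1, hence r = vol(B)^(-1/d)
    return math.exp(-log_volume / d)


def limit_ratio(p: float) -> float:
    """Limit of radius(d, p) / d^(1/p) as d tends to infinity (via Stirling)."""
    return (p * math.e) ** (-1.0 / p) / (2.0 * math.gamma(1.0 + 1.0 / p))


for p in (1.0, 1.5, 2.0):
    for d in (10, 1000, 100000):
        print(p, d, radius(d, p) / d ** (1.0 / p), limit_ratio(p))
```

Working with logarithms of the Gamma function avoids overflow for large dimensions.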


Chapter 13 


Excellent introductions to the theory of $C_0$-semigroups include the monographs Apple-
baum (2019); Davies (1980); Engel and Nagel (2000); Pazy (1983). For a discussion of
the examples in Section 13.6 we refer to these sources. The monumental 1957 treatise 
Hille and Phillips (1957) is freely available online. 

Parts of Sections 13.1—13.4 and Figures 13.1 and 13.2 are taken from Appendix G of 
Hytönen et al. (2017), which in turn is based on the corresponding material in the au-
thor’s lecture notes for the 2006/07 Internet Seminar “Stochastic Evolution Equations”, 
available on the author’s webpage. 

Theorem 13.11 is due to Phillips (1955). Theorem 13.16 is a special case of a result of
Jørgensen (1982). The idea to use this result to prove Wiener’s Tauberian theorem is from
van Neerven (1997). Theorem 13.17 was obtained independently by Hille and Yosida
near the end of the 1940s. An extension to arbitrary $C_0$-semigroups, which is somewhat
more technical to state, was found soon afterwards. A detailed account of Theorem
13.17 and its history is given in Engel and Nagel (2000). The intimate connections 
between semigroups and the theory of Laplace transforms are emphasised in Arendt 
et al. (2011). 

Fuller treatments of the abstract Cauchy problem are given in Amann (1995); Arendt 
(2004); Tanabe (1979). 

Analytic semigroups are treated in detail in Lunardi (1995). Maximal regularity for
bounded analytic $C_0$-semigroups on Hilbert spaces was first proved in De Simon (1964).
The result remains valid if $L^2$ is replaced by $L^p$ with $1 < p < \infty$ throughout; this
follows from rather deep extrapolation arguments for singular integral operators and falls
outside our scope. For a full treatment as well as references to the extensive literature on the
subject the reader is referred to Hytönen et al. (2022+) whose treatment we follow. The
method of applying maximal regularity to solving time-dependent problems of Section
13.4.d goes back to Clément and Li (1993/94) and has been extended to cover a wealth
of other nonlinear problems.

In the light of Example 13.41 (which is revisited at the end of this section) it is
of some interest to mention that, in the converse direction, every generator $-A$ of an
analytic $C_0$-semigroup of contractions on a complex Hilbert space $H$ can be represented
in divergence form, in the following precise sense: There exists a Hilbert space
$\mathscr{H}$, a closed operator $V : D(V) \subseteq H \to \mathscr{H}$ with dense domain and
dense range, and a bounded coercive operator $B \in \mathscr{L}(\mathscr{H})$, that is, we have
$(Bx|x)_{\mathscr{H}} \ge \beta \|x\|_{\mathscr{H}}^2$ for some $\beta > 0$ and all
$x \in \mathscr{H}$, such that

\[
A = V^* B V.
\]

More precisely, there exists a densely defined, closed, sectorial form $\mathfrak{a}$ in $H$ with
domain $D(\mathfrak{a}) = D(V)$ such that $A$ is the operator associated with $\mathfrak{a}$ and

\[
\mathfrak{a}(g,h) = (BVg|Vh), \qquad g,h \in D(V).
\]

A proof of this result can be found in Maas and van Neerven (2009), where it is also
shown that this representation is essentially unique.

Our proofs of Theorem 13.50 and Lemma 13.54 are taken from Arendt (2006).

The Ornstein–Uhlenbeck semigroup has many interesting properties, for which the
reader is referred to Hytönen et al. (2023+); Janson (1997); Nualart (2006). Probabilistically,
up to a time scaling it arises as the transition semigroup associated with the
solution $(u_x(t))_{t\ge 0}$ of the stochastic differential equation

\[
du(t) = -\tfrac12 u(t)\,dt + dB_t, \qquad t \ge 0,
\]

with initial condition $u(0) = x$; the driving process $(B_t)_{t\ge 0}$ is a standard Brownian
motion in $\mathbb{R}^d$. More precisely, for all $t \ge 0$ and $f \in L^2(\mathbb{R}^d,\gamma)$
one has

\[
OU(t/2) f(x) = \mathbb{E}(f(u_x(t)))
\]

for almost all $x \in \mathbb{R}^d$.

The domain identification $D(L) = W^{2,p}(\mathbb{R}^d,\gamma)$ for the Ornstein–Uhlenbeck
operator $L$ in $L^p(\mathbb{R}^d,\gamma)$ with $1 < p < \infty$ is due, in a more general
formulation, to Metafune et al. (2002). This paper also contains references to earlier papers on
this subject, in particular regarding the special case $p = 2$. The Ornstein–Uhlenbeck semigroup
extends to an analytic $C_0$-contraction semigroup in $L^p(\mathbb{R}^d,\gamma)$ for
$1 < p < \infty$, with optimal angle $\theta_p$ given by

\[
\cos\theta_p = \Big|\frac{2}{p} - 1\Big|.
\]

This result is due to Epperson (1989), who also showed that the exact domain of holomorphy
is the set

\[
E_p := \{z = x + iy \in \mathbb{C} : |\sin(y)| < \tan(\theta_p)\sinh(x)\}.
\]

A simpler proof of the latter was given in van Neerven and Portal (2018).

The $L^1$-to-$L^\infty$ bound (13.36) for the free Schrödinger group $(S(t))_{t\in\mathbb{R}}$ in
Section 13.6.g is an example of a so-called dispersive estimate. It informs us that initial data
in $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ are mapped instantaneously (in forward and backward
time) to $L^\infty(\mathbb{R}^d)$ and decay to 0 as $|t| \to \infty$ with respect to the norm of
$L^q(\mathbb{R}^d)$ for all $2 < q \le \infty$. This bound lies at the basis of a class of deep
estimates, named after Strichartz who proved an analogous estimate for the wave group, the
simplest of which gives a bound for the $L^p(\mathbb{R};L^q(\mathbb{R}^d))$ norm of $S(\cdot)f$
for initial data $f \in L^2(\mathbb{R}^d)$ and suitable exponents $p,q$.
Such estimates, in turn, are the key to solving certain important classes of nonlinear
Schrödinger equations. For a detailed treatment of these matters the reader is referred to
the lecture notes Hundertmark et al. (2013) and the references cited there; an elementary
introduction is presented in Stein and Shakarchi (2011).

The argument in the first part of Section 13.6.h is taken from Hundertmark et al. 
(2013). 

The example in Problem 13.17 is due to Arendt (1995). 

If $A$ is a densely defined operator acting in a Banach space with the property that
$(-\infty,0) \subseteq \rho(A)$ and

\[
\sup_{\lambda \in (0,\infty)} \lambda\,\|(\lambda + A)^{-1}\| < \infty,
\]

then there exists a unique densely defined closed operator $A^{1/2}$ such that

\[
(A^{1/2})^2 = A.
\]

Moreover, $D(A)$ is dense in $D(A^{1/2})$ and

\[
A^{1/2} x = \frac{1}{\pi} \int_0^\infty \lambda^{-1/2} (\lambda + A)^{-1} A x \, d\lambda,
\qquad x \in D(A).
\]

A proof of this result can be found in Section 3.8 of Arendt et al. (2011). In particular it
applies to $A = -B$ whenever $B$ is the generator of a uniformly bounded $C_0$-semigroup
on $X$. The result should be compared to Proposition 10.58, where it was shown that if
$A$ is a positive selfadjoint operator acting in a Hilbert space, then $A$ admits a unique
positive square root $A^{1/2}$.

Let us now apply this to the divergence form operator $A_a := -\mathrm{div}(a\nabla)$ of Example
13.41 associated with the sesquilinear form

\[
\mathfrak{a}_a(f,g) := \int_{\mathbb{R}^d} a\nabla f \cdot \overline{\nabla g},
\qquad f,g \in H^1(\mathbb{R}^d),
\]

where the $(d \times d)$-matrix $a = (a_{ij})_{i,j=1}^d$ is assumed to have bounded measurable
real-valued coefficients satisfying the uniform ellipticity condition stated in the example.
Since $-A_a$ generates an analytic $C_0$-semigroup of contractions, by the above discussion
the square root $A_a^{1/2}$ is well defined. The Kato square root problem is to decide whether
the domain equality

\[
D(A_a^{1/2}) = D(\mathfrak{a}_a) = H^1(\mathbb{R}^d)
\]

holds, with equivalence of homogeneous norms

\[
\|(-\mathrm{div}\,a\nabla)^{1/2} f\| \eqsim \|\nabla f\|
\]

for all functions $f$ in this common domain. Starting with the papers Kato (1961) and
McIntosh (1982), this problem has witnessed a long and interesting history. It was
finally resolved in the affirmative, in the generality stated here, in Auscher et al. (2002).
This paper also contains references to the various special cases that had been obtained
before. An alternative proof based on the theory of bisectorial operators was obtained
subsequently in Axelsson et al. (2006).


Chapter 14 


Step 1 in our proof of the result of Section 14.5.b follows Theorem IV.3.1 in Gohberg
and Goldberg (1981). Our treatment of Theorem 14.34 follows Dunford and Schwartz
(1988b). Theorem 14.45 goes back to Mercer (1909). The proof of Theorem 14.46 and
Problems 14.16 and 14.17 are taken from Murphy (1994). Theorem 14.47 is from
Helton and Howe (1973). The results of Section 14.4 are taken from Attal (2013). 

Our proof of Lidskii’s theorem is due to Simon (1977), whose arguments we follow 
here. A survey of the connections between determinants and traces, containing a proof 
of MacMahon’s formula as well as a treatment of Fredholm determinants, is Cartier 
(1989). A proof of Lemma 14.43, based on a theorem of Borel and Carathéodory, is
given in Simon (1977). For much more on this topic the reader may consult Simon 
(2005). 

For positive kernel operators, Theorem 14.45 is due to Mercer (1909). The extension 
to general integral operators of trace class is taken from Birman (1989). In that paper it 
is also shown how to extend the result to general measure spaces whose $L^2$ space
is separable. Further interesting results on this topic can be found in Brislawn (1988). It
is of interest to note that not every integral operator with continuous kernel is of trace 
class; a classical counterexample can be found in Carleman (1916). 

The derivation of Euler’s formula from the trace of the Dirichlet Laplacian on the 
interval is taken from Grieser (2007). 
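
To illustrate the kind of computation involved, here is one standard route to Euler's formula $\sum_{n\ge 1} n^{-2} = \pi^2/6$; we state it as an assumption about the shape of the argument, and the text and Grieser (2007) should be consulted for the actual derivation. On the interval $(0,1)$ the Dirichlet Laplacian $-d^2/dx^2$ has eigenvalues $n^2\pi^2$, $n \ge 1$, and its inverse is the integral operator with continuous kernel $k(x,y) = \min(x,y)\,(1-\max(x,y))$. Computing the trace of the inverse once as the sum of its eigenvalues and once as the integral of the kernel over the diagonal gives $\sum_{n\ge 1} \frac{1}{n^2\pi^2} = \int_0^1 x(1-x)\,dx = \frac16$. The Python sketch below checks both sides numerically; the function names are hypothetical.

```python
import math


def green_kernel(x: float, y: float) -> float:
    """Green's function of -u'' = f on (0, 1) with u(0) = u(1) = 0."""
    return min(x, y) * (1.0 - max(x, y))


def trace_from_kernel(steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of k(x, x) over (0, 1)."""
    h = 1.0 / steps
    return sum(green_kernel((i + 0.5) * h, (i + 0.5) * h) for i in range(steps)) * h


def trace_from_eigenvalues(terms: int = 100_000) -> float:
    """Partial sum of the eigenvalues 1 / (n^2 pi^2) of the inverse operator."""
    return sum(1.0 / (n * n * math.pi ** 2) for n in range(1, terms + 1))


if __name__ == "__main__":
    print(trace_from_kernel(), trace_from_eigenvalues(), 1.0 / 6.0)
```

Both computed values should be close to $1/6$; the eigenvalue sum converges slowly, with a tail of order $1/(N\pi^2)$ after $N$ terms.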

The proof outlined in Problem 14.7 is taken from Arendt (2006), where it is attributed 
to Markus Haase. Problem 14.15 is taken from Helton and Howe (1973) and Problems
14.16 and 14.17 are from Murphy (1994). 


Chapter 15 


Historical aspects of the interaction between Functional Analysis and Quantum Me- 
chanics are well recorded in Landsman (2019). An excellent modern introduction to 
Quantum Mechanics from the mathematician’s point of view is Hall (2013). More ad- 
vanced treatments are offered in Landsman (1998, 2017); Mackey (1968); Parthasarathy 
(2005); Takhtajan (2008). 

The mathematical formulation of Quantum Mechanics using the language of Hilbert 
space theory is due to von Neumann (1968). Ever since the publication of this work 
in 1932, physicists, mathematicians, and philosophers have wondered as to why Na- 
ture made that choice by looking for deeper criteria characterising Hilbert spaces. A 
first important step in this direction was taken in Piron (1964) and Amemiya and Araki 
(1966/1967) in the 1960s, who proved that a complex inner product space is a Hilbert 
space if and only if it is orthomodular. By definition, an inner product space $H$ is
orthomodular if $Y + Y^\perp = H$ for every closed subspace $Y$ of $H$. The theorem of Piron and
Amemiya–Araki was extended by several mathematicians to inner product spaces over
$\mathbb{R}$, $\mathbb{C}$, and $\mathbb{H}$, the field of quaternions. The definitive result in this direction was proved
in Solér (1995). In order to state her result we need the following terminology. 

Let $H$ be a vector space over a field $K$. A Hermitian form on $H$ is a mapping
$(\cdot|\cdot) : H \times H \to K$ satisfying the axioms of an inner product except the requirement that
$(x|x) = 0$ should imply $x = 0$. A Hermitian vector space is a vector space endowed with a
Hermitian form. A subspace $Y$ of a Hermitian vector space $H$ is called closed if $Y^{\perp\perp} = Y$,
orthogonal complements being defined in the obvious way using the Hermitian form.
A Hermitian vector space $H$ is called orthomodular if $Y + Y^\perp = H$ for every closed
subspace $Y$ of $H$. A field $K$ is called a $*$-field if it admits an involution, that is, a mapping
$c \mapsto c^*$ from $K$ onto itself satisfying $(c_1+c_2)^* = c_1^* + c_2^*$, $(c_1c_2)^* = c_2^*c_1^*$, and $c^{**} = c$
for all $c_1, c_2, c \in K$.

Now we are ready to state Solér’s theorem: If $H$ is a Hermitian vector space over a
$*$-field $K$ admitting an infinite orthonormal sequence (orthonormality being defined in
the obvious way using the Hermitian form), then:


• $K$ equals $\mathbb{R}$, $\mathbb{C}$, or $\mathbb{H}$;
• the Hermitian form is an inner product;
• $H$ is a Hilbert space over $K$.


A survey of Solér’s theorem is given in Holland (1995). Very recently, the theorem was 
used in Heunen and Kornell (2022) to give a characterisation of the category of Hilbert 
spaces as the unique category (in the sense of category theory) satisfying certain natural
category theoretical axioms. 

Modern treatments replace the language of operator theory on Hilbert spaces by that 
of $C^*$-algebras. By a theorem of Gelfand, Naimark, and Segal (see, for example, Rudin
(1991)), every $C^*$-algebra can be represented as a closed $*$-subalgebra of $\mathscr{L}(H)$ for
an appropriate Hilbert space $H$, so not much seems to be gained. The advantage of
this approach, however, is that it covers both the classical and the quantum settings:
by a theorem due to Gelfand, every commutative $C^*$-algebra can be represented as a
space $C_0(\Omega)$ for some locally compact Hausdorff space $\Omega$, and as $C(K)$ for some compact
Hausdorff space $K$ if the $C^*$-algebra has a unit. In this precise sense, the ‘classical
world’ is commutative, while the ‘quantum world’ is noncommutative. Comprehensive
treatments of C*-algebras and states defined on them are offered in Blackadar (2006); 
Bratteli and Robinson (1987); Pedersen (2018); Takesaki (2002). 

Proofs of Gleason’s theorem mentioned at the end of Section 15.2.a can be found in 
Landsman (2017); Parthasarathy (2005). 

Our proof of Theorem 15.29 is taken from Akhiezer and Glazman (1981b). Another
proof can be derived from Stinespring’s dilation theorem. This approach is presented 
in Han et al. (2014), which may be consulted for more on (positive) operator-valued 
measures. Older references on the subject are Berberian (1966); Davies (1976); Holevo 
(2011); Landsman (1998). For a detailed discussion and examples of unsharp observ- 
ables the reader is referred to Busch et al. (1995), which is also the source for the results 
of Section 15.3.d. The phase POVM $\Phi$ introduced in this section was studied in Garrison
and Wong (1970). 

Let us now sketch an elegant proof of Naimark’s theorem based on Stinespring’s 
theorem. We leave out some details which can be found in Stinespring (1955); see also 
Paulsen (2002). Let $Q$ be a POVM on $(\Omega,\mathscr{F})$ and let $\Psi_Q : B_{\rm b}(\Omega) \to \mathscr{L}(H)$ be the
bounded functional calculus of Proposition 15.27. The crucial observation is that every
positive bounded operator $\Psi : B_{\rm b}(\Omega) \to \mathscr{L}(H)$ is completely positive, that is, for all
$n = 1,2,\dots$ and all $f_1,\dots,f_n \in B_{\rm b}(\Omega)$ and $h_1,\dots,h_n \in H$ we have

\[ \sum_{j,k=1}^n (\Psi(f_j\overline{f_k})h_j|h_k) \ge 0. \]

By Gelfand’s theorem there is no loss of generality in assuming that $\Omega$ is a compact
Hausdorff space and that $\mathscr{F}$ is its Borel $\sigma$-algebra. Fixing an integer $n \ge 1$, by the Riesz
representation theorem we find a finite Borel measure $\mu$ on $\Omega$ such that

\[ \sum_{j=1}^n (\Psi(g)h_j|h_j) = \int_\Omega g\,d\mu, \qquad g \in C(\Omega). \]

By the Radon–Nikodym theorem there exist functions $h_{jk} \in L^1(\Omega,\mu)$ such that

\[ (\Psi(g)h_j|h_k) = \int_\Omega g\,h_{jk}\,d\mu, \qquad g \in C(\Omega). \]

One then checks that the matrix $(h_{jk})_{j,k=1}^n$ is positive $\mu$-almost everywhere. Also the
matrix $(f_j\overline{f_k})_{j,k=1}^n$ is positive $\mu$-almost everywhere. It follows that
$\sum_{j,k=1}^n f_j\overline{f_k}\,h_{jk} \ge 0$ $\mu$-almost everywhere and therefore

\[ \sum_{j,k=1}^n (\Psi(f_j\overline{f_k})h_j|h_k) = \int_\Omega \sum_{j,k=1}^n f_j\overline{f_k}\,h_{jk}\,d\mu \ge 0, \]

as was to be shown. Now (a special case of) Stinespring’s theorem asserts that every
completely positive bounded mapping $\Psi : B_{\rm b}(\Omega) \to \mathscr{L}(H)$ satisfying $\|\Psi(\mathbf{1})\| = 1$ is of
the form

\[ \Psi(f) = J^*\Pi(f)J, \]

where $J$ is an isometry from $H$ to a Hilbert space $\widetilde{H}$ and $\Pi : B_{\rm b}(\Omega) \to \mathscr{L}(\widetilde{H})$ is a
$*$-homomorphism. Applying this to $\Psi_Q$ and restricting $\Pi$ to indicator functions,
Proposition 15.24 gives us the desired projection-valued measure.

Physically, the qubit corresponds to the 2-dimensional irreducible unitary representation
of SU(2) and as such it models a spin-$\frac12$ particle. For every $n \in \mathbb{N}$, SU(2) admits an
irreducible representation which acts on $\mathbb{C}^{n+1}$ and represents a spin-$\frac12 n$ particle (which
is a boson if $n$ is even and a fermion if $n$ is odd). More on this topic can be found in
Sternberg (1994); Woit (2017). 

A complete proof of Theorem 15.33, including a proof of the algebraic fact that was 
used in our proof for the qubit case, is given in Landsman (2017). Our proof for the qubit 
case is extracted from it. Bargmann’s theorem mentioned in the text is in Bargmann 
(1954); a complete proof is also found in Parthasarathy (2005). 

Theorem 15.32 is a straightforward generalisation of the hidden variable result of 
Holevo (2011), where also the resulting hidden variable model for the qubit is derived. 
The existence of hidden variables for the qubit was first observed by Bell (1966). There 
is an extensive literature on the nonexistence of hidden variables, but such results usually 
work with more restrictive notions of hidden variables. A discussion of these results can 
be found in Landsman (2017). 

For introductions to Lie groups and LCA groups we recommend Folland (2016). 
More complete treatments of covariance lead to the notion of systems of imprimitivity 
studied in Mackey (1968). For in-depth discussions of covariance and the way it pins 
down observables we recommend Parthasarathy (2005); Varadarajan (1985). A discussion
from the physicist’s point of view is given in Busch et al. (1995). Theorem 15.38
is a special case of a generalisation of Stone’s theorem (Theorem 13.46) for arbitrary 
strongly continuous unitary representations of G; see Theorem 4.5 in Folland (2016). 

The presentation of the Stone–von Neumann theorem follows Folland (1989) and
Hall (2013). The theorem admits a generalisation to LCA groups, essentially due to 
Mackey. For modern references to the literature on this subject the reader is referred 
to the survey article Rosenberg (2004). The formula for the Ornstein—Uhlenbeck semi- 
group in Theorem 15.55 goes back, at least, to Unterberger (1979); in its present form 
it is taken from van Neerven and Portal (2018). 

Treatments of second quantisation can be found in Janson (1997); Parthasarathy 
(1992); Simon (1974). For a discussion from the Physics perspective we recommend 
Talagrand (2022). The proof of Theorem 15.69 is taken from Simon (1974). Theorems 
15.70 and 15.71 are due to Segal (1956). Our discussion of the position and momen- 
tum operators follows Parthasarathy (1992), except that we use different normalisations 
designed to arrive at the physicist’s identities (15.44) and (15.45) for the quantum harmonic
oscillator. Proposition 15.74 can be found in the notes to Chapter 1 of Nualart
(2006). As mentioned in the text, most results in Section 15.6 generalise to infinite dimensions
if one replaces the Gaussian measure $\gamma$ on $\mathbb{R}^d$ by a so-called $H$-isonormal
process defined on a probability space $(\Omega,\mathbb{P})$, where $H$ is a real Hilbert space taking
the role of $\mathbb{R}^d$. The resulting theory has deep connections with the theory of stochastic
integration; see, for example, Nualart (2006).

Let us finish by describing an interesting connection with Number Theory. Roughly
speaking it says that, spectrally, the positive integers are precisely the second quantised
primes. The starting point for making this into a rigorous statement is a theorem of
Brown and Pearcy (1966) stating that if $T$ is a bounded operator on a Hilbert space $H$, then the
spectrum of its $n$-fold tensor product $T^{\otimes n}$ acting on the Hilbert space $H^{\otimes n}$ equals

\[ \sigma(T^{\otimes n}) = \{\lambda_1\cdots\lambda_n : \ \lambda_j \in \sigma(T), \ j = 1,\dots,n\}. \]


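If $\|T\| < 1$, by taking direct sums one arrives at the formula

\[ \sigma\Bigl(\bigoplus_{n\in\mathbb{N}} T^{\otimes n}\Bigr) = \bigcup_{n\in\mathbb{N}} \{\lambda_1\cdots\lambda_n : \ \lambda_j \in \sigma(T) \hbox{ for } 1 \le j \le n\} \]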


with contribution 1 for the spectrum of $T^{\otimes 0} := I$. Now let $P = \{2,3,5,7,11,\dots\}$ be
the set of primes and consider the Hilbert space $\ell^2(P)$. Denoting the standard unit basis
vectors of this space by $e_2, e_3, e_5, \dots$ we consider the contraction

\[ T : e_p \mapsto \frac1p e_p, \qquad p \in P. \]

Then $\|T\| = \frac12$,

\[ \sigma(T) = \bigl\{\tfrac1p : \ p \in P\bigr\} \cup \{0\}, \]

and accordingly

\[ \sigma\Bigl(\bigoplus_{n\in\mathbb{N}} T^{\otimes n}\Bigr) = \bigl\{\tfrac1n : \ n \in \mathbb{N}, \ n \ge 1\bigr\} \cup \{0\}, \]

with each point $\frac1n$ being a simple pole, thanks to the uniqueness of prime factorisation.
This observation (which extends to the symmetric second quantisation of 7), as well as 
deeper connections, can be found in Bost and Connes (1995); Connes (1994). 
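
The role played by unique prime factorisation can be made concrete with a small computation. The Python sketch below enumerates all finite multisets of primes whose product does not exceed a given bound and records the corresponding products $\frac{1}{p_1\cdots p_k}$; the empty multiset corresponds to $T^{\otimes 0} = I$ and contributes the value 1. We count unordered multisets, which corresponds to the symmetric setting mentioned above; this choice, the truncation at a finite bound, and the function names are our own illustrative assumptions and are not taken from the cited references.

```python
from collections import Counter
from fractions import Fraction


def primes_up_to(bound: int) -> list:
    """Return all primes <= bound by trial division (adequate for small bounds)."""
    return [p for p in range(2, bound + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def eigenvalue_counts(bound: int) -> Counter:
    """Map each value 1/(p_1 ... p_k) with p_1 ... p_k <= bound to the number of
    prime multisets {p_1 <= ... <= p_k} producing it (k = 0 gives the value 1)."""
    primes = primes_up_to(bound)
    counts = Counter()

    def extend(product: int, start: int) -> None:
        counts[Fraction(1, product)] += 1
        # Only append primes with index >= start, so each multiset is visited once.
        for i in range(start, len(primes)):
            nxt = product * primes[i]
            if nxt > bound:
                break  # primes are sorted, so all later products are larger still
            extend(nxt, i)

    extend(1, 0)
    return counts


if __name__ == "__main__":
    counts = eigenvalue_counts(30)
    print(sorted(counts, reverse=True)[:6])
    print(set(counts.values()))
```

Up to the chosen bound $N$, the values obtained are exactly $1, \frac12, \frac13, \dots, \frac1N$, and each of them arises from exactly one multiset of primes.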


Appendices 


Most of the material is standard; some proofs are taken from Folland (1999); Kallenberg 
(2002); Ryan (2002). Zorn’s lemma is equivalent to the Axiom of Choice. A proof of
this fact and further equivalences can be found in Jech (1973, 2003); Rubin and Rubin 
(1970). The proof of Tychonov’s theorem follows Ruzhansky and Turunen (2010). The 
treatment of Carathéodory’s theorem is based on lecture notes by Mark Veraar.


Credits 


John von Neumann’s picture is reprinted with permission from George Karger/The 
Chronicle Collection/Getty Images. All other pictures are in the public domain. 